Development  of  a  Taxonomy  of  Human  Performance: 


The Task Characteristics Approach to Performance Prediction



Alfred J. Farina
George R. Wheaton


Technical Report 7
FEBRUARY 1971


AMERICAN  INSTITUTES  FOR  RESEARCH 

WASHINGTON  OFFICE 

Address: 8555 Sixteenth Street, Silver Spring, Maryland 20910
Telephone  HOI  I  58*8201 




R71-6 


AMERICAN  INSTITUTES  FOR  RESEARCH 

WASHINGTON, D.C.

EDWIN A. FLEISHMAN, PhD, Director

Albert S. Glickman, PhD, Deputy Director


INSTITUTE  FOR  COMMUNICATION  RESEARCH 

Arthur L. Korotkin, PhD, Director

Research on instructional, communication, and information systems and their
effectiveness in meeting individual and social needs.


INSTITUTE  FOR  RESEARCH  ON  ORGANIZATIONAL  BEHAVIOR 

Clifford P. Hahn, MS, Director

Research on human resources, selection and training, management and organization,
safety,  and  administration  of  justice. 


UNCLASSIFIED
Security Classification


DOCUMENT CONTROL DATA - R&D


1. ORIGINATING ACTIVITY (Corporate author)

American Institutes for Research
135 North Bellefield Avenue

2a. REPORT SECURITY CLASSIFICATION

UNCLASSIFIED

3. REPORT TITLE

DEVELOPMENT OF A TAXONOMY OF HUMAN PERFORMANCE: THE TASK CHARACTERISTICS APPROACH
TO PERFORMANCE PREDICTION

4. DESCRIPTIVE NOTES (Type of report and inclusive dates)

Interim Technical Report

5. AUTHOR(S) (First name, middle initial, last name)

Alfred J. Farina and George R. Wheaton

6. REPORT DATE

February 1971

8a. CONTRACT OR GRANT NO.

F44620-67-C-0116 (AFOSR)
(Army-BESRL)

c. ARPA Order Number 1032

d. ARPA Order Number 1623

9a. ORIGINATOR'S REPORT NUMBER(S)

AIR-726/2035-2/71-TR7

9b. OTHER REPORT NO(S) (Any other numbers that may be assigned
this report)

BESRL Research Study 71-7

10. DISTRIBUTION STATEMENT

This document has been approved for public release and sale;
its distribution is unlimited.

11. SUPPLEMENTARY NOTES

12. SPONSORING MILITARY ACTIVITY

Advanced Research Projects Agency
Department of Defense, Washington, D.C.

13. ABSTRACT

The development and evaluation of systems for describing and classifying tasks which
can improve generalization of research results about human performance is essential for
organizing, communicating, and implementing these research findings. The present report
describes research undertaken to develop one such system--a task characteristics approach.
Basic objectives were to develop descriptive characteristics of tasks; assess the
reliability of rating scales devised to measure these characteristics; and determine if these
characteristics represented correlates of performance.

Major components of a task were identified and treated as categories within which to
devise task characteristics or descriptors. Each characteristic was cast into a rating
scale format which presented a definition of the characteristic and a seven-point scale
with defined anchor- and mid-points along with examples for each point. Nineteen scales
were developed and evaluated in a series of 3 reliability studies. In general, it was
found that a subset of scales having adequate reliability consistently emerged in all 3.

"Post-diction" was the paradigm used to determine whether the task characteristics
were correlates of performance on which predictive relationships might be established.
Performance measures were abstracted from studies already existing in the literature.
Two post-diction studies were conducted--the first involved 6 scales and 26 tasks, the
second involved 6 scales and 20 tasks--with encouraging results.

Significant multiple correlations of .82 and .73 were obtained between task
characteristic ratings and the performance measures. It appears possible to describe tasks in
terms of a task-characteristics language that is relatively free of the subjective and
indirect descriptors found in many other systems. Task characteristics may represent
important correlates of performance; as shown here, it was possible to describe subtle
differences among tasks and to relate such differences systematically to variations in
performance.


DD Form 1473

Security Classification


KEY WORDS


Task Taxonomy
Human Performance
Task Characteristics
Multiple Regression
Model of Performance


AIR-726/2035-2/71-TR7


DEVELOPMENT  OF  A  TAXONOMY  OF  HUMAN  PERFORMANCE: 


THE  TASK  CHARACTERISTICS  APPROACH 
TO  PERFORMANCE  PREDICTION 


Alfred  J.  Farina 
George  R.  Wheaton 


TECHNICAL  REPORT  7 


Prepared  under  Contract  for 
Advanced  Research  Projects  Agency 
Department  of  Defense 
ARPA  Orders  No.  1032  and  1623 


Principal  Investigator:  Edwin  A.  Fleishman 

Contract Nos. F44620-67-C-0116 (AFOSR)
DAHC19-71-C-0004  (ARMY-BESRL) 


American  Institutes  for  Research 
Washington  Office 


February  1971 

Approved for public release; distribution unlimited.


PREFACE 


The AIR Taxonomy Project was initiated as a basic research effort
in September 1967, under a contract with the Advanced Research Projects
Agency, in response to long-range and pervasive problems in a variety of
research and applied areas. The effort to develop ways of describing
and classifying tasks which would improve predictions about factors
affecting human performance in such tasks represents one of the few
attempts to find ways to bridge the gap between research on human
performance and the applications of this research to the real world of
personnel and human factors decisions.

The present report is one of a series which resulted from work
undertaken during the first three years of project activity. In 1970,
monitorship of the project was transferred from the Air Force Office of
Scientific Research (AFOSR) to the U. S. Army Behavior and Systems
Research Laboratory (BESRL), under a new contract. This report, completed
under the new contract, is among several describing the previous
developmental work. It is also being distributed separately as a BESRL
Research Study.


EDWIN  A.  FLEISHMAN 
Senior  Vice  President  and 
Director,  Washington  Office 
American Institutes for Research


FOREWORD 


The  American  Institutes  for  Research  is  engaged  in  a  research 
program to develop and evaluate new systems for describing and classifying
tasks which can improve generalization of research results about
human  performance  and  to  develop  a  common  language  for  researcher- 
decision  maker  communication  that  would  help  organize  human  performance 
information  for  maximum  use  in  training,  equipment  design,  and  personnel 
selection. 

The  objective  of  this  program  is  to  develop  theoretically-based 
language  systems  (taxonomies)  which--when  merged  with  appropriate  sets 
of  decision  logic  and  appropriate  sets  of  quantitative  data--can  be  used 
to  make  improved  predictions  about  human  performance.  Such  taxonomies 
should  be  useful,  for  example,  when  future  management  information  and 
decision  systems  are  designed  for  Army  use. 

During  previous  project  years,  three  different  taxonomic  systems 
were  developed,  each  of  which  seemed  to  have  maximum  relevance  for  a 
different  type  of  application:  the  ability-requirement  approach;  the 
task  characteristics  approach;  and  a  third  approach  based  on  information 
theory. 

The  present  publication  reports  on  the  development  and  preliminary 
assessment  of  the  task  characteristics  approach  to  the  prediction  of 
human performance. The approach seeks to describe tasks in terms of a
task-oriented  language  which,  when  combined  with  multiple  regression 
techniques,  can  be  used  to  predict  task  performance. 


J. E. UHLANER, Director
U.  S.  Army  Behavior  and  Systems 
Research  Laboratory 


ACKNOWLEDGMENT 


Conduct  of  a  study  of  this  type  necessarily  involved  the  efforts  of 
many  people  in  addition  to  the  authors.  We  wish  to  acknowledge  the 
overall  support  and  guidance  of  the  Principal  Investigator,  Dr.  Edwin  A. 
Fleishman.  Dr.  Fleishman  was  particularly  helpful  in  providing  full 
access  to  the  data  of  several  of  his  past  studies  which  permitted  both 
the reliability and post-diction efforts to be accomplished most
efficiently.

The  authors  are  particularly  indebted  to  Dr.  William  J.  Baker, 
formerly  of  AIR,  who  developed  much  of  the  rationale  underlying  the 
predictive  model  used  in  this  study.  Our  thanks  are  also  extended  to 
Susan  Emery  for  her  many  contributions  to  the  post-diction  efforts,  and 
to  Norma  Lee  for  her  able  assistance  during  the  reliability  studies. 


Alfred  J.  Farina 
George  R.  Wheaton 


DEVELOPMENT  OF  A  TAXONOMY  OF  HUMAN  PERFORMANCE:  THE  TASK  CHARACTERISTICS 
APPROACH  TO  PERFORMANCE  PREDICTION 

BRIEF 


Requirement: 

Of  the  many  conditions  which  can  influence  human  performance,  the 
most  poorly  described  and  least  understood  are  those  embodied  in  the  task. 
As  a  consequence,  the  ability  to  relate  performance  observed  in  one  task 
to  that  observed  in  other  tasks  is  limited.  The  present  research  describes 
a  series  of  studies  conducted  to  develop  an  instrument  in  terms  of  which 
the  stimulus,  procedural,  and  response  characteristics  of  tasks  could  be 
described. It discusses additional studies which were designed to determine
whether dimensions comprising the descriptive language represented
correlates  of  human  performance. 

Procedure:

The  basic  steps  in  this  research  were  to:  (a)  develop  descriptive 
characteristics  of  tasks;  (b)  assess  the  reliability  of  rating  scales 
devised  to  measure  these  characteristics;  and  (c)  determine  if  these 
characteristics  represented  correlates  of  performance. 

The overall direction taken by the project was influenced by a
heuristic model which viewed performance as a function of three sets
of antecedent conditions: the operator, the environment, and the task.
A decision was made to focus initial efforts on the task component of
the model, holding the other components in abeyance.

Toward this end, major components of a task were identified and
treated as categories within which to devise task characteristics or
descriptors. Each characteristic was cast into a rating scale format
which presented a definition of the characteristic and provided a
seven-point scale with defined anchor- and mid-points along with examples
for each point. Nineteen scales were developed and evaluated in a series
of three reliability studies.

The paradigm used to determine whether the task characteristics
were correlates of performance upon which predictive relationships
might be established was that of "post-diction." Post-diction referred
to the situation in which performance measures were abstracted from
studies already existing in the literature. Subjects rated descriptions
of the tasks used in these studies on task characteristic scales and
then these ratings were subjected to multiple regression analysis to
establish the extent to which they were related to the performance in
question. Two such post-diction studies were conducted. The first
post-diction study involved six scales and 26 tasks while the second
study involved six scales and 20 tasks.

Findings:

In general, it was found that a subset of scales having adequate
reliability consistently emerged in all three reliability studies. The
results of the two post-diction studies were encouraging in that
significant multiple correlations of .82 and .73 were obtained between task
characteristic ratings and the performance measures.

Utilization  of  Findings: 

Although a final interpretation of these findings must await
cross-validation efforts, it does appear possible to describe tasks in terms
of a task-characteristic language which is relatively free of the
subjective and indirect descriptors found in many other systems. Further,
task characteristics may represent important correlates of performance;
as shown here, it was possible to describe subtle differences among tasks
and to relate such differences systematically to variations in performance.



CONTENTS


INTRODUCTION

BACKGROUND
    Heuristic Model of Performance
    Nature and Use of the Task Descriptive System
        Classification
        Prediction
    Objectives

SCALE DEVELOPMENT
    Task Definition
    Task Characteristics

RELIABILITY STUDIES
    First Reliability Study
    Second Reliability Study
    Third Reliability Study
    Discussion

POST-DICTION STUDIES
    First Post-Diction Study
    Second Post-Diction Study
    Discussion

CONCLUSIONS AND RECOMMENDATIONS

REFERENCES

DD Form 1473 (Document Control Data - R&D)

APPENDICES

FIGURES


Figure 1. Relationship among the terms "task," "components," and "characteristics"


TABLES


Table 1. Sample task characteristic rating scale
Table 2. Reliability estimates for three judges using original scales to rate 37 tasks
Table 3. Reliability estimates for twenty-eight judges using revised scales to rate 15 tasks
Table 4. Reliability estimates for two judges using eighteen scales to rate 21 tasks
Table 5. Listing of the most reliable scale within each of the three reliability studies
Table 6. Basic data for the first regression analysis
Table 7. Intercorrelation matrix for the first regression analysis
Table 8. Basic data for the second regression analysis
Table 9. Intercorrelation matrix for the second regression analysis
Table 10. Comparison of post-diction studies 1 and 2




INTRODUCTION 


A major problem confronting the behavioral sciences and technologies
is the lack of a structure within which to describe, interpret, and
organize information about human performance. Without such a structure,
limits are placed on the extent to which findings from different studies
can be compared, contrasted, and integrated into a systematic body of
knowledge. At the root of this problem is the absence of unifying
dimensions for systematically describing those antecedent conditions of
which performance is a function.

Of the many conditions which can influence performance, the most
poorly described and the least understood are those embodied in the
task. As a consequence, the ability to relate performance observed in
one task to that observed in other tasks is limited. At present,
research results obtained with one task can be safely generalized only
to other tasks which are so highly similar as to be almost identical.
The ability to communicate research findings unambiguously is similarly
hampered. Behavioral scientists, and those who must apply research
findings to operational problems, are without a language for interrelating
performance on different tasks.

A burgeoning research literature and a growing demand for application
of findings both underscore the need for an integrative structure. A
system is needed which will yield better predictions of the effects of
independent variables on task performance. Similarly, a system is needed
to predict more accurately the learning rates or proficiency levels
associated with new tasks. These needs have been recognized by many
investigators (e.g., Fleishman, 1962, 1967; Hackman, 1968; Melton &
Briggs, 1960; and Miller, 1962). Fitts (1962), in particular, has
called for a taxonomy which should identify important correlates of
learning rate, performance level, and individual differences, and be
equally applicable to laboratory tasks and to tasks encountered in
industry and in military service.


The key to establishing such a taxonomy lies in developing a
well-defined task descriptive language. Earlier reports under this project
(e.g., Farina, 1969; Wheaton, 1968) as well as other reviewers (e.g.,
Ginsberg, McCullers, Merryman, Thomson, & Witte, 1966) suggest that
three general approaches are most prevalent. They differ primarily in
terms of the manner in which description is accomplished.

In the first approach, description centers on the specific activities
in which an operator engages while performing a task. Interest lies
in specifying what the operator actually does. Those who have taken
this approach (e.g., Fine, 1963; McCormick, 1968; and Reed, 1967) are
more concerned with describing performance per se and less concerned
with the conditions giving rise to that performance. In the second
approach, description focuses on those resources of the operator which
are required for performance on the task. Gagne (1962) and Miller (1966),
for example, describe tasks in terms of those functions or processes
which the operator is required to utilize. In a similar vein, tasks
have been described in terms of the types and amounts of human abilities
upon which the tasks make demands (e.g., Fleishman, 1967; Theologus,
Romashko, & Fleishman, 1970). In this second general approach, emphasis
is on critical aspects of the individual intervening between features
of the task and consequent performance.

A third approach to developing a task descriptive language treats
the task as a critical sub-set of the antecedent conditions of which
performance is a function. Hackman (1968) states this position clearly:

"...That  is,  if  we  are  interested  in  the  effects  of  tasks 
and  task  characteristics  on  behavior,  it  is  essential  that 
we  develop  a  means  of  describing  and  classifying  our 
independent  variables  (tasks)  other  than  in  terms  of  the 
dependent  variables  (behaviors)  to  which  we  ultimately 
wish  to  predict." 

Investigators taking this tack (e.g., Cotterman, 1959; Fitts, 1962;
Folley, 1964; and Stolurow, 1964) attempt description in terms of the
characteristics of the task confronting the operator.




It is this latter approach to developing a task descriptive
language which would seem appropriate for the type of taxonomy called
for by Fitts. In order to eventually predict the performance which
will result when a subject is exposed to a given situation, one must be
able to specify and fully describe those independent variables which
are in effect. Part of this specification must necessarily include that
stimulus complex known as the "task" which confronts the subject. It
is within this complex that many correlates of learning rate or
proficiency level will be found. Knowledge of these variables would provide
a basis for comparing performance on different tasks. They would also
provide a basis for classifying tasks with respect to the behavioral
consequences of other classes of independent variables.

The present report describes a series of studies conducted to
develop an instrument in terms of which the stimulus, procedural, and
response characteristics of tasks could be described. It discusses
additional studies which were designed to determine whether dimensions
comprising the descriptive language represented correlates of human
performance.




BACKGROUND 


The research described in the present report was part of a larger
programmatic effort concerned with development of a taxonomy of human
performance (Fleishman, 1967; Fleishman, Kinkade, & Chambers, 1968;
Fleishman, Teichner, & Stephenson, 1970; Fleishman & Stephenson, 1970).
In support of this general program of research, several alternative task
descriptive systems were developed. The general purpose of each of these
systems was to provide a basis for classifying tasks in order to permit
better organization and increased generalization of performance data
within and between task categories.

Studies described in the present report were concerned with the
development and initial use of one such system. Known as the task
characteristics approach, it attempted to provide for the description
of tasks in terms of a variety of task-intrinsic properties including
goals, stimuli, procedures, response modes, etc. The decision to
describe tasks in these rather morphological terms, instead of using
more behavioral-, process- or ability-oriented descriptors, stemmed from
the conviction that tasks, in their own right, represented a potent
class of independent variables. Accordingly, if the variables
comprising a task were manipulated singly or in combination (e.g., creating
a number of different tasks), the resultant effects on performance could
be mapped systematically. Knowledge of how performance varied, as a
result of manipulating the characteristics of tasks, would provide a
basis for estimating performance on other tasks whose characteristics
could be described.

The consequences of the foregoing rationale for development and
use of a task descriptive system were explored by constructing an heuristic
model of performance. In turn, this model helped specify what was to
be described, how description was to be accomplished, and how the task
descriptive indices were to be related to performance.




Heuristic  Model  of  Performance 


During early stages of the project an heuristic model of performance
was entertained. The model, known as POET, simply stated that any
obtained performance score (P) was necessarily the function of at least
three major classes of independent variables. These included the
particular task (T) on which performance was measured, the specific operator
(O) whose performance was monitored, and the environmental conditions
(E) under which performance took place. Included in the latter class
were all variables (e.g., ambient noise, drug dosages, conditions of
practice, etc.) which were extrinsic to either the task or the operator
and primarily impinged on the latter.

The POET model, therefore, suggested that the difference in
performance which might be observed when comparing two experiments could be
due to variations within any one or all three of the major classes of
independent variables. Observed differences in performance could arise
from the use of different samples of operators, or from different tasks,
or from the application of different treatments (extrinsic variables).
Consequently, it seemed obvious that any system which was developed to
permit increased generalization of performance data would have to take
all three classes of variables into consideration. This in turn meant
that descriptive systems would eventually be required for each of the
major components within the model.
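
As an illustrative shorthand only (the report gives no functional form, and the symbol $f$ and the subscripts are assumptions introduced here), the POET relationship and the comparison of two experiments might be written as:

$$P = f(O, E, T)$$

$$P_1 - P_2 = f(O_1, E_1, T_1) - f(O_2, E_2, T_2)$$

Under this notation, if the same operators are used ($O_1 = O_2$) under the same extrinsic conditions ($E_1 = E_2$), any remaining difference $P_1 - P_2$ would have to be attributed to the difference between $T_1$ and $T_2$.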

Instead of attacking the problem at this general level, however,
the decision was made to develop descriptive systems sequentially. The
issue, therefore, was to decide upon which descriptive system to place
initial emphasis. There appeared to be a variety of ways in which to
describe different operators based on such variables as age, intelligence,
abilities, interests, etc. Indeed, many studies have been conducted
in which individual differences on these and similar "personal" variables
were systematically related to variations in performance. By the same
token there seemed to be fairly adequate description and specification
of what were termed the "environmental" variables. In most cases
descriptive systems dealing with this component have been sufficient
to permit investigation of the effects of different levels of treatment
upon performance of a large number and variety of variables.

While description of the operator and of the environment seemed
adequate, description of the task component was not. Most of the
available descriptive systems were inadequate because they failed to emphasize
the task as an antecedent condition of performance, a condition which
could be subjected to systematic and specifiable manipulation. Such
systems prevented one from readily talking about type or, more
significantly, level of treatment in the sense that one could for the operator
and environment components. Yet the ability to make such statements
seemed essential if one were to investigate the effects of variations
in tasks on subsequent performance. Therefore, while recognizing the
importance of descriptive systems for all three components, the decision
was made to focus initial efforts on a task descriptive system. As
explained in a later section of this report, description was based on
a variety of task characteristics.

Nature  and  Use  of  the  Task  Descriptive  System 

During  early  stages  of  the  project  consideration  was  also  given  to 
the  manner  in  which  the  descriptive  data  provided  by  the  system  were 
to  be  used  in  organizing  tasks  and  consequent  performance  data.  This 
issue  was  of  importance  for  it  was  felt  that  specification  of  the  intended 
use(s)  of  the  descriptive  data  would  culminate  in  a  set  of  requirements 
for  the  language  itself.  Two  major  uses  were  identified:  classification 
and  prediction.  Task  characteristics  data  would  provide  a  basis  for 
classifying tasks in terms of their observed similarities and
dissimilarities. The descriptive data could also be utilized within a
multiple  regression  context  to  relate  variations  in  the  characteristics 
of  tasks  to  variations  in  performance. 




Classification - Although several alternative approaches to the
classification of tasks were considered (Wheaton, 1968), it seemed
desirable to approach classification on quantitative rather than on
qualitative grounds. One technique available for this purpose was the
similarity coefficient described by Cattell and Coulter (1966). This
coefficient was designed to describe the similarity between pairs of
profiles in terms of a distance function. Therefore, if descriptive
profiles could be generated for tasks, it would be possible to
mathematically express the similarity among them in terms of a matrix of
similarity coefficients. These data could then be analyzed by cluster
analytical techniques to define clusters or classes of highly similar
tasks (Silverman, 1967). Although this type of analysis was not of
primary concern in the present research, it did emphasize the need for
a descriptive system which treated tasks in terms of quantitative
profiles.
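
The idea of expressing task similarity through a distance function over rating profiles can be sketched as follows. This is a minimal illustration, not the Cattell and Coulter coefficient itself: plain Euclidean distance and the mapping 1 / (1 + d) are stand-ins chosen for simplicity, and the task names and ratings are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Hypothetical profiles: task name -> ratings on four seven-point scales.
PROFILES = {
    "tracking": [6, 2, 5, 1],
    "monitoring": [5, 2, 6, 1],
    "assembly": [1, 7, 2, 6],
}


def distance(p, q):
    """Euclidean distance between two rating profiles of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def similarity(p, q):
    """Map distance onto (0, 1]; 1.0 means the profiles are identical.

    Stand-in measure only, not the published similarity coefficient.
    """
    return 1.0 / (1.0 + distance(p, q))


def similarity_matrix(profiles):
    """Similarity for every unordered pair of tasks, keyed by name pair."""
    names = sorted(profiles)
    return {
        (a, b): similarity(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }


def most_similar_pair(profiles):
    """Return the pair of task names with the highest similarity."""
    matrix = similarity_matrix(profiles)
    return max(matrix, key=matrix.get)


for pair, value in similarity_matrix(PROFILES).items():
    print(pair, round(value, 3))
```

A matrix of this kind is the sort of input a cluster-analytic routine would take; the clustering step itself is left out of the sketch.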

Prediction  -  Another  use  to  which  descriptive  data  could  be  put  was 
in  predicting  learning  rates  or  proficiency  levels  on  tasks  for  which 
performance  data  were  not  already  available.  Emphasis  in  this  approach 
was not on classifying tasks but rather on identifying those
characteristics of tasks which were correlates of performance. It was this
latter  approach  which  was  pursued  in  the  present  study. 

A multiple-regression model was developed in which task
characteristic descriptors were treated as predictor variables. The model was
based on the premise that descriptive terms could be selected which
represented correlates of performance and, as such, could be used to
predict average learning rates or proficiency levels on different tasks.
The rationale underlying the regression approach was as follows. Suppose
a single group of operators performed two different tasks yielding the
same type of performance measures. If individuals' scores were averaged
on each task and if these two means differed, then, since identical
subjects are involved, the difference between means could only be
attributed to differences between the tasks themselves (assuming
"environmental" variables to be identical in both situations). The difference
between tasks would be specified in terms of task descriptors.




If the concept of differences between tasks and consequent differences
between performance means were extended to a larger set of tasks,
performed by the same operators under the same conditions, then a variable
(P_a) would be created. A given value on this variable would represent
the mean performance score associated with a particular task (a) within
the set of tasks. It was hypothesized, therefore, that specific values
for this variable could be predicted in terms of task characteristic
scale values. The multiple regression equation required for that
purpose would have the following form:


$$P_a = a_0 + a_1 X_{a1} + a_2 X_{a2} + \cdots + a_n X_{an}$$


where

$P_a$ = predicted mean performance score on task "a",
$a_n$ = regression weight for the nth task descriptor, and
$X_{an}$ = the value for task "a" on task descriptor "n".
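
To make the form of the regression equation concrete, the sketch below fits the weights by ordinary least squares and then evaluates the equation for a new task. It is only an illustration under stated assumptions: the ratings and mean scores are hypothetical (constructed without noise so the recovered weights can be checked), only two descriptors are used, and the fitting procedure shown (normal equations solved by Gaussian elimination) is a generic one, not necessarily the procedure used in the studies reported here.

```python
from math import sqrt

# Hypothetical data: six tasks rated on two seven-point scales (X_a1, X_a2).
# Mean scores were constructed, without noise, as P_a = 10 + 2*X_a1 - 1*X_a2
# so that the recovered weights can be checked by eye.
RATINGS = [(1, 7), (2, 5), (3, 6), (4, 2), (5, 4), (7, 1)]
MEAN_SCORES = [5.0, 9.0, 10.0, 16.0, 16.0, 23.0]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_weights(ratings, mean_scores):
    """Least-squares weights [a_0, a_1, ..., a_n] via the normal equations."""
    design = [[1.0] + list(row) for row in ratings]  # leading 1.0 carries a_0
    columns = list(zip(*design))
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(u * p for u, p in zip(ci, mean_scores)) for ci in columns]
    return solve(xtx, xty)


def predict(weights, task_ratings):
    """Evaluate P_a = a_0 + a_1*X_a1 + ... + a_n*X_an for one task."""
    return weights[0] + sum(a * x for a, x in zip(weights[1:], task_ratings))


def multiple_r(weights, ratings, mean_scores):
    """Correlation between predicted and observed mean scores."""
    predicted = [predict(weights, row) for row in ratings]
    mp = sum(predicted) / len(predicted)
    mo = sum(mean_scores) / len(mean_scores)
    cov = sum((p - mp) * (o - mo) for p, o in zip(predicted, mean_scores))
    var_p = sum((p - mp) ** 2 for p in predicted)
    var_o = sum((o - mo) ** 2 for o in mean_scores)
    return cov / sqrt(var_p * var_o)


weights = fit_weights(RATINGS, MEAN_SCORES)
print("weights:", [round(w, 3) for w in weights])
print("multiple R:", round(multiple_r(weights, RATINGS, MEAN_SCORES), 3))
```

With real ratings the fit would not be exact, and a multiple R estimated from a small set of tasks would be expected to shrink on cross-validation.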


To  accomplish  these  ends,  however,  it  was  necessary  to  impose  a 
major  restriction  on  the  model.  The  tasks  under  investigation  at  any 
one  time  had  to  share  a  common  response  measure  (e.g.,  reaction  time, 
time  on  target,  percent  correct,  etc.).  This  restriction  had  profound 
consequences  for  it  implied  that  different  regression  equations  would 
be  required  to  handle  different  types  of  performance  measures.  Such 
would  not  have  been  the  case  had  it  been  possible  to  describe  different 
measures  of  performance  in  terms  of  a  single  common  metric.  The  absence 
of  this  universal  metric,  however,  made  it  necessary  to  categorize  tasks 
in  terms  of  the  measures  employed  to  describe  performance  on  them.  The 
categories  of  performance  described  by  Teichner  and  Olson  (1969)  were 
considered  for  this  purpose.  Separate  regressions  were  anticipated  for 
tasks  yielding  such  diverse  performance  measures  as  probability  of 
detection,  reaction  time,  percent  correct,  and  percent  time  on  target. 




The  consequences  of  the  regression  model  for  the  descriptive  system 
were  readily  determined.  The  system  had  to  contain  multiple  dimensions, 
each  of  which  could  be  applied  to  any  selected  task.  The  dimensions 
had  to  be  quantitative  in  nature  and  had  to  possess  a  reasonably  high 
reliability.  Finally,  if  the  model  were  to  aid  in  predicting  parameters 
of  performance,  the  descriptive  dimensions  had  to  represent  correlates 
of  performance. 

Objectives 

Based  upon  these  background  considerations,  the  present  research 
attempted  to  accomplish  the  following  objectives.  A  series  of  generically 
applicable quantitative rating scales was to be developed for description
of various task characteristics. The reliability with which these
scales  could  be  used  to  describe  tasks  was  to  be  determined.  Finally, 
the  feasibility  of  using  the  descriptive  data  as  predictors  of  mean 
levels  of  performance  on  different  tasks  needed  to  be  determined.  The 
remainder  of  this  report  describes  the  activities  conducted  in  pursuit 
of  these  objectives. 




SCALE DEVELOPMENT


Task  Definition 


The development of task characteristics received initial guidance
from a definition of the term "task" which was devised early in the
project. Given that interest lay in predicting performance, a task was
defined as a potential means of eliciting performance. More specifically,
it referred to a complex situation capable of eliciting goal-directed
performance from an operator. Given this orientation, a task was conceived
of as having several components with each component possessing
certain salient characteristics. These components were: an explicit
goal, procedures, input stimuli, responses, and stimulus-response
relationships.

An explicit goal was a specification of the "state" or "condition"
to be achieved by the operator. By "explicit" was meant that the goal
was indicated to at least the operator and one independent observer,
and that some objective procedure existed whereby the observer could
verify whether or not the goal had been achieved. A task also had to
include a statement of the "means" by which the goal was to be attained.
The "means" consisted of procedures which were statements specifying the
types of stimulus-response relationships to be formed, and their sequencing.
Then, too, the task had to contain a set of relevant input
stimuli attended to by the operator. Finally, the statement of the
task had to describe a set of responses contributing to goal attainment.

Task  Characteristics 


Given  the  arbitrary  requirement  that  a  task  possess  these  components, 
it  followed  that  if  a  potential  "task”  did  not  possess  all  of  these 
components,  then  by  definition  it  was  not  a  task  under  the  present  system, 
and  if  an  operator  failed  to  perform  in  accordance  with  the  specified 
procedures,  the  question  of  goal  attainment  for  that  task  could  not  be 
raised.  The  operator,  by  definition,  would  not  have  performed  the  task 




in question; in fact, he would have performed a different task. This
latter point led to a direct consideration of what it was that served
to make tasks different. That is, given that all tasks had the above
components, what distinctions could be made within these common components?
What were, for example, characteristics of a task goal which, when
measured in some fashion, would serve to differentiate among various task
goals?

In order to differentiate among tasks, therefore, the components of
a task were treated as categories within which to devise task characteristics
or descriptors. As previously mentioned, additional requirements
were set forth regarding these characteristics. Each had to be applicable
to most, if not all, types of tasks so as to avoid the problem of not
being able to rate or measure all tasks on a comparable set of dimensions.
Each characteristic had to be expressed quantitatively, being scaled in
at least an ordinal fashion. Each had to possess an acceptable degree
of reliability. Finally, to achieve economy of use, it was desirable
that the characteristics require a minimum of training time and application
time on the part of the user.

Figure 1 clarifies the relationship among the terms "task", "task
components", and "characteristics". Each characteristic was cast into
a rating scale format which presented a definition of the characteristic,
and provided a seven-point scale with defined anchor- and mid-points
along with examples for each point (Smith & Kendall, 1963). A sample
rating scale is shown in Table 1. The complete set of 19 scales originally
developed is shown in Appendix 1.

The original set of scales has undergone changes due to refinement,
additions, and deletions. Consequently, the appendix section contains
three separate sets of task characteristic scales, each having been used
in a separate reliability study*. This evolutionary process is still
not complete, but it has progressed far enough to provide a demonstration
of the basic approach. During this developmental phase the task characteristics
were viewed as critical independent variables which, if
manipulated, would influence task performance. While an indirect test
of this view was attempted in the "post-diction" studies to be discussed
later, the ultimate test would entail actual manipulation of these
characteristics within an experimental task and observation of concomitant
changes in performance.

* Three sets of task characteristic scales rather than one final set are
presented since there is no "final" set in the sense that a reader could
rate a task on it and then apply appropriate Beta Weights to gain an
estimate of performance on that task. The research is still in its early
stages where a demonstration of its feasibility is the issue being
addressed. In addition, the results of the various reliability and post-diction
studies require the inclusion of the specific scales and tasks
used.


[Figure 1: diagram with columns TASK, TASK COMPONENTS, TASK CHARACTERISTICS]

Figure 1. Relationship among the terms "task," "components," and "characteristics."


Table 1

SAMPLE TASK CHARACTERISTIC RATING SCALE


VARIABILITY OF STIMULUS LOCATION

Judge the degree to which the physical location of the stimulus or
stimulus complex is predictable over task time.

Definitions                          Examples

High predictability - stimulus       Stimulus is a red light located on a
location remains basically           display panel.
unchanged.

Medium predictability -              Visually following an arrow in
location changes but in a            flight toward a target.
known manner or pattern.

Low predictability - location        Predicting which leaf will fall from
changes in an almost random          a tree next.
fashion.


RELIABILITY  STUDIES 


First  Reliability  Study 

Following development of the original set of rating scales a series
of reliability studies was conducted. In the first such study the task
characteristic rating scales were employed in their original form. Three
research assistants were trained in the use of the scales and were then
asked to rate 37 rather simple psychomotor tasks on each of 19 scales.
The task descriptions with which the raters worked are referenced in
Appendix 2.


The obtained ratings were cast into analyses of variance to determine
intraclass correlation coefficients for each scale. Following the
method described by Winer (1962, p. 124), two coefficients (r_k and r_1)
were calculated. The r_k coefficient provided an estimate of the
reliability of the mean of the three judges' (k = 3) ratings. The r_1
coefficient estimated the reliability of a single rating. The obtained
coefficients, together with the variance components used in the calculation
of r_k, are shown in Table 2.

Winer (1962, p. 126) suggested an interesting interpretation of the
intraclass correlations. Each r_k coefficient was an estimate of the
correlation which would be obtained were the mean ratings of the present
three judges correlated with the mean ratings from another random sample
of three judges rating the same tasks. Using an r_k equal to or greater
than 0.70 as an arbitrary index of acceptable reliability, seven of the
19 original scales appeared to be adequate.

Three of the scales (7, 12, 18) shown in Table 2 possessed r_k's
with negative values. Theoretically, r_k may range in value from zero
(0) to plus one (+1). In practice, however, it can be demonstrated
that r_k will assume a negative value in those cases where the mean-square
within term is greater than the mean-square between term (e.g.,
r_k = 1 - MS_w / MS_b). Interpretation of such negative r_k coefficients is difficult.
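The behavior of these coefficients can be sketched in Python. The sketch below assumes a one-way analysis of variance with tasks as the grouping factor, r_k = 1 - MS_w / MS_b, and the common one-way form r_1 = (MS_b - MS_w) / (MS_b + (k - 1) MS_w); whether this matches the exact computation used in the report is an assumption. The two small rating matrices are invented for illustration.

```python
def intraclass_correlations(ratings):
    """ratings: one row per task, one column per judge. Returns (r_1, r_k).

    Assumes a one-way layout; raises ZeroDivisionError if the task means
    are all identical (no between-task variance).
    """
    n = len(ratings)            # number of tasks
    k = len(ratings[0])         # number of judges
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    task_means = [sum(row) / k for row in ratings]

    ss_between = k * sum((m - grand_mean) ** 2 for m in task_means)
    ss_within = sum((x - m) ** 2
                    for row, m in zip(ratings, task_means) for x in row)
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))

    r_k = 1 - ms_within / ms_between
    r_1 = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    return r_1, r_k


# Hypothetical ratings: judges agree and tasks differ clearly.
r1_good, rk_good = intraclass_correlations([[1, 1, 2], [4, 4, 5], [7, 6, 7]])

# Hypothetical ratings: judges disagree, task means barely differ,
# so MS_w exceeds MS_b and both coefficients turn negative.
r1_bad, rk_bad = intraclass_correlations([[1, 7, 4], [5, 1, 7], [7, 4, 2]])

print(r1_good, rk_good, r1_bad, rk_bad)
```

The second case shows why a negative coefficient can reflect either disagreement among judges or a set of tasks that the scale does not separate.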


Table 2

RELIABILITY ESTIMATES FOR THREE JUDGES USING
ORIGINAL SCALES TO RATE 37 TASKS

[Table body illegible in the source scan: variance components and intraclass correlation coefficients (r_1 and r_k) for each of the 19 scales.]


Inspection of the rating data showed that the three judges were
actually in strong agreement on Scale # 12. However, the judges were not
able to differentiate among tasks very effectively, as shown by the
relatively small between-task variance component for this scale. Evaluating
this scale on another and more heterogeneous sample of tasks
would either raise its estimated reliability or confirm its insensitivity.
Scales # 7 and # 18 had relatively large within-task variances suggesting
a lack of consistency among judges. Inspection of the actual ratings
confirmed this impression, particularly in the case of Scale # 7 where
judges were in confusion about the end-points (1 or 7) of the scale.


Second Reliability Study


After  the  first  study,  many  of  the  original  scales  were  examined 
in  an  attempt  to  improve  their  reliabilities.  Some  scales  were  deleted 
and  others  underwent  minor  or  major  revision  to  clarify  the  exact 
nature  of  the  dimension  being  rated  and  the  meaning  of  the  scale  anchor 
points.  The  resulting  instrument  consisted  of  16  scales  (Appendix  3). 

In  an  attempt  to  estimate  the  reliability  of  the  revised  scales,  28 
judges  rated  20  tasks  on  each  scale.  The  28  judges  were  college  students 
recruited  from  a  local  university.  Prior  to  the  actual  study,  the 
judges  were  thoroughly  familiarized  with  the  meaning  of  each  scale  and 
with  the  rating  procedure.  The  judges  were  paid  for  their  participation. 


Reliability estimates were obtained for each of the 16 scales.
These data were based on only 15 of the 20 tasks which were actually
rated. The five tasks which were eliminated were cognitive, paper-and-pencil
tasks. They were originally included to determine whether or
not the judges could describe them reliably in terms of the task characteristics.
The judges were largely unsuccessful in this effort. Consequently,
it was decided to limit use of the scales, at least initially,
to psychomotor tasks. Descriptions of the 15 tasks which were finally
analyzed in terms of r_1 and r_k are shown in Appendix 4.




The reliability estimates are shown in Table 3 together with the
relevant variance components. The striking feature of these data was
the relatively low reliability for an individual rater (r_1). Were only
one judge of the type employed in this study to assign ratings, he would
be fairly reliable only on one scale (# 15). More reliable ratings could
be obtained, however, were the mean ratings of either three or five
judges utilized. Using the Spearman-Brown Prophecy Formula (Winer, 1962,
p. 127) it can be shown that if r_1 ≥ .33, then r_3 ≥ .60 and r_5 ≥ .71.
On this basis, adequate reliability could be expected on at least seven
scales. The remaining scales appeared to need additional revision.
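The Spearman-Brown step can be checked with a short Python sketch. It uses the standard form of the formula, r_k = k r_1 / (1 + (k - 1) r_1); the function name is hypothetical.

```python
def spearman_brown(r_1, k):
    """Reliability of the mean of k ratings, given single-rating reliability r_1."""
    return k * r_1 / (1 + (k - 1) * r_1)


r_3 = spearman_brown(0.33, 3)   # mean of three judges
r_5 = spearman_brown(0.33, 5)   # mean of five judges
print(round(r_3, 2), round(r_5, 2))
```

With r_1 = .33 the values come to roughly .60 and .71, in line with the figures quoted in the text.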


Third  Reliability  Study 


Finally, additional reliability data were obtained during an analysis
of 21 tracking tasks (see Appendix 6) under a contract with the U. S.
Naval Training Device Center. In this effort two judges evaluated the
tasks in terms of many different measures, including 18 task characteristic
scales. The 18 scales (Appendix 5) represented revised versions
of many of the earlier scales. In this study both judges were highly
familiar with the scales and the procedures for their use.


As shown in Table 4, the rating data from this study were evaluated
in several ways. First, as in the preceding studies, analyses of
variance were conducted which permitted calculation of the intraclass
correlation coefficients (r_1 and r_k). Second, similarity coefficients
(r_p) were computed which expressed how similar the two judges were in
evaluating the tasks on each scale. The technique was essentially one
of profile analysis. The r_p statistic (Cattell & Coulter, 1966) could
range in value from -1.0 to 1.0, being asymptotic with respect to -1.0.
An r_p value of 1.0 meant that the two profiles fell on exactly the
same point in multi-dimensional space. An r_p of -1.0 meant that the
two profiles were maximally dissimilar. Finally, for each scale the
number of times the two judges were within plus or minus one scale unit
of each other was determined and expressed as a percentage of 21 cases.
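The percent-agreement index described above is simple enough to sketch in Python. The function name and the two short rating lists are hypothetical; the report's own data covered 21 tasks per scale.

```python
def percent_within_one(judge_a, judge_b):
    """Percentage of tasks on which two judges differ by at most one scale unit."""
    if len(judge_a) != len(judge_b):
        raise ValueError("both judges must rate the same set of tasks")
    hits = sum(1 for a, b in zip(judge_a, judge_b) if abs(a - b) <= 1)
    return 100.0 * hits / len(judge_a)


# Hypothetical ratings of five tasks on one seven-point scale.
ratings_a = [1, 4, 7, 3, 5]
ratings_b = [2, 4, 5, 3, 7]
agreement = percent_within_one(ratings_a, ratings_b)
print(agreement)
```

Unlike the intraclass correlation, this index does not depend on how much the tasks differ from one another, which is one reason the three measures can disagree.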


Table 3

[Table body illegible in the source scan: reliability estimates and variance components for the 16 revised scales.]


Table 4

RELIABILITY ESTIMATES FOR TWO JUDGES USING
EIGHTEEN SCALES TO RATE 21 TASKS

[Table body illegible in the source scan: variance components, intraclass correlation coefficients (r_1 and r_k), similarity coefficients (r_p), and percentage of agreement for each of the 18 scales.]

** p < .01
 * p < .05


Interpretation of the intraclass coefficients shown in Table 4 was
again difficult. Four r_k coefficients were above 0.70 and appeared to
represent reasonably reliable scales. In terms of the similarity coefficients
(r_p) ten were significant, implying agreement between judges'
profiles. Finally, on eight scales the judges were in agreement at
least 90% of the time. Only three scales (# 4, # 13, and # 16) failed
to exhibit either a high r_k (r_k ≥ .70), a significant r_p, or a high
percentage (90%) of agreement.

Discussion 


Our  experience  in  assessing  the  reliability  of  the  task  characteristic 
scales indicated that the statistical methods used often tended to preclude
a definitive answer to the question of scale reliability. Interpretation
of the intraclass correlation technique proved troublesome
when  a  small  but  consistent  bias  existed  among  raters  in  the  use  of  a 
scale,  and  each  rater  assigned  but  one  scale  value  to  all  tasks.  In 
these  instances  the  question  was  whether  the  tasks  were  truly  homogeneous 
with  respect  to  those  scales  or  whether  the  scales  were  insensitive  to 
differences  among  tasks. 

The similarity coefficient technique (r_p) also yielded cases where
an inspection of the raw ratings was required before an interpretation
could  be  made.  Finally,  the  percent  agreement  data,  while  intuitively 
appealing  in  their  logic,  lacked  any  formal  status  as  a  statistic. 

The entire issue of reliability as it applied to the rating data
was  not  clear-cut.  Test-retest  reliability,  for  example,  would  assess 
how  consistent  an  average  rater  was  in  applying  a  particular  scale.  It 
would  not  address  itself  to  the  equally  important  question  of  how  well 
the  raters  would  agree  among  themselves  in  their  collective  use  of  a 
scale.  Similarly,  the  intraclass  correlation  coefficient  did  shed  some 
light  on  inter-rater  agreement,  but  it  appeared  to  require  some  unknown 
amount  of  heterogeneity  among  the  tasks  being  rated  to  do  so.  Ideally, 
one would want each rater to be highly consistent in his use of a scale
on a test-retest basis, and also to have raters in high agreement on a
scale's use across tasks. Unfortunately, no one statistical technique
seemed applicable to assessing both of these aspects.

Regarding the scales themselves, it appeared that a subset of scales
consistently emerged which had adequate reliability in all three studies.
Table 5 shows the sets of scales for each study which were most reliable.
There was a high degree of consistency between the reliable scales
emerging from the three-judge and 28-judge studies. Comparing this
common subset to the reliable scales of the two-judge study, four of the
six were again reliable. Additional scales were also reliable but these
were employed only in the two-judge study.

In general, consideration of these three reliability studies led
to the following recommendations:

(a) the raters should have a background in psychology or human
factors, or a good awareness of such concepts as stimulus and
response;

(b) at least three raters should be used in applying the scales
in their present form, with an average of their ratings being
used as the value to be assigned to the characteristic in
question;

(c)  further  development  of  the  scales  should  go  in  the  direction 
of  enumeration  (counting)  rather  than  rating;  and 

(d)  further  efforts  should  include  an  assessment  of  test-retest 
reliability. 


Table 5

[Table body illegible in the source scan: the most reliable scales in each of the three reliability studies (three-judge, 28-judge, and two-judge).]


POST-DICTION  STUDIES 


The paradigm used to determine whether the task characteristics
were correlates of performance upon which predictive relationships
might be established was that of "post-diction". Post-diction simply
refers to the fact that existing criterion data were used, whereas in
prediction, arrangements are made to collect data in accordance with
some specific experimental design. Post-diction sacrifices precise
control over many variables in order to rapidly acquire a relevant set
of data for analysis. Ratings were made of the tasks used in these
studies and then these task characteristic ratings were entered into a
multiple regression analysis to establish the extent to which they were
related to or predictive of the performance in question. The task
descriptions in the literature were often too brief to use, but it was
possible to obtain detailed descriptions either from a study's author
(e.g., Fleishman in the first post-diction study), or by acquiring the
references an author made to more detailed descriptions of the task/apparatus.
Through these means it was possible to provide the judges
with explicit descriptions of the tasks to be rated. Employing the
post-diction paradigm, two studies were conducted.

Both studies shared a number of common restrictions. First, in
selecting studies for the two post-diction efforts, there was the need
to have a common metric of performance within each. That is, the
studies used for any one regression analysis had to be comparable in
terms of the unit of performance. Thus, for the first post-diction the
performance measure of all studies was expressed in terms of "the number
of output units produced per unit time". The second post-diction
used studies in which the common performance metric was "percent time
on target". In general, this need for a common metric served to reduce
the number of studies available for analysis. The relatively small
number of studies in both post-diction efforts created, in turn, the
following problems:


1. For a regression analysis the number of predictors should not
approach, let alone exceed, the number of cases sampled. As the number
of predictors (i.e., characteristic scales) approaches the number of
cases sampled (i.e., studies or tasks), the multiple correlation coefficient
becomes spuriously large and uninterpretable. Since this was the case
initially  in  both  post-dictions,  the  decision  was  made  to  use  only  a 
selected  set  of  the  task  characteristic  indices  as  opposed  to  the  full 
set.  For  example,  instead  of  using  19  indices  and  26  tasks  in  the  first 
regression  study,  a  smaller  set  of  six  indices  was  used. 

2.  The  small  number  of  studies  sampled  precluded  any  meaningful 
attempt  to  perform  the  important  step  of  cross-validating  the  resultant 
regression  equations. 
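The first problem can be made concrete with a small Python sketch. It relies on a standard result that is not stated in the report: when a normally distributed criterion is unrelated to any of the predictors, the expected value of R^2 from least-squares regression with an intercept is k / (n - 1) for k predictors and n cases. The function name is hypothetical.

```python
def expected_null_r_squared(n_predictors, n_cases):
    """Expected R^2 when the predictors carry no information about the criterion.

    Assumes ordinary least squares with an intercept and a normally
    distributed criterion; requires n_predictors < n_cases.
    """
    if n_predictors >= n_cases:
        raise ValueError("need fewer predictors than cases")
    return n_predictors / (n_cases - 1)


# 26 tasks, as in the first post-diction study.
for k in (4, 6, 19):
    print(k, expected_null_r_squared(k, 26))
```

Under these assumptions, 19 scales applied to 26 tasks would be expected to "explain" about three quarters of the criterion variance by chance alone, against about a quarter for six scales.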

First  Post-Diction  Study 

The first post-diction study was based on a portion of the data
(Fleishman, 1954) used to conduct the reliability study described earlier
in which three judges rated 37 tasks on 19 scales. Applying the requirement
for a common performance measure, the 37 tasks were carefully
screened in order to determine the types of performance measures associated
with them. Although several different measures were represented
(e.g., reaction time, percent time on target, or percent correct), 26 of
the tasks had one measure in common which was designated as the "number
of units produced per unit time". The "units" varied and included such
things as: number of blocks moved; number of assemblies completed;
number of taps made; and number of correct discriminations given.

Common to these 26 tasks was the requirement that as many "units" as
possible be produced during specified time periods. Since different
amounts of time were allowed for completion of the various tasks (e.g.,
25 to 900 seconds), a common time frame was needed to provide a standard
basis for comparison. The "unit time" chosen for this purpose was one
second. Therefore, the performance score reported for each task was
prorated to obtain the average number of units produced per second (e.g.,
98.5 units produced in 80 seconds equalled 1.231 units per second). (The
26 tasks are indicated by asterisks in Appendix 2.)
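The proration is a single division, sketched here in Python with the figures quoted in the text; the function name is hypothetical.

```python
def units_per_second(units_produced, seconds_allowed):
    """Prorate a task score to the common one-second unit time."""
    if seconds_allowed <= 0:
        raise ValueError("the time allowed must be positive")
    return units_produced / seconds_allowed


rate = units_per_second(98.5, 80)   # the example given in the text
print(rate)
```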



Since the entire set of 19 rating scales (Appendix 1) could not be
employed, a smaller subset was selected. The six most reliable scales
were chosen for analysis (see Table 2). For each of these scales the
ratings provided by three judges were averaged to obtain a single value
on each scale for each of the tasks. The specific scales employed in
the study were:

1. Stimulus duration (scale # 15),

2. Number of output units (scale # 1),

3. Duration for which an output unit is maintained (scale # 2),

4. Simultaneity of responses (scale # 9),

5. Number of procedural steps (scale # 10), and

6. Variability of stimulus location (scale # 14).

Table 6 presents the data on which the first post-diction study was
based. A Wherry-Doolittle stepwise regression analysis was carried out
by computer. Six predictor variables were entered into the analysis,
but only five were processed. The order in which the scales are listed
above represents their order of extraction based upon the percent
variance accounted for in the criterion measure (R^2). Although five
scales emerged from the analysis, a point of diminishing returns in
terms of percent variance accounted for was reached after extraction of
the fourth scale. Consequently, a regression equation was written using
only the first four scales listed above. The half-diagonal intercorrelation
matrix for all seven variables (six predictors, one criterion) is
presented in Table 7.

The multiple correlation coefficient for this analysis (based on
four predictors) was R = 0.85, which accounted for 72% of the variance
(R²) in the criterion measure. This correlation was significant
(F (4, 21) = 13.75, p < .01). It was felt, however, that the small
sample (n = 26) used in this analysis yielded an inflated multiple R
relative to what might have been obtained had a larger sample (n > 100)
been used. Accordingly, a correction in R for small sample bias (Guilford,
1956, p. 399) was applied. The corrected correlation (cR) was 0.82,
which was still significant (F (4, 21) = 10.78, p < .01).
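The significance test and the small-sample correction can be checked with a short sketch. The report cites Guilford (1956, p. 399) but does not print the formula; the shrinkage form used below, cR² = 1 - (1 - R²)(n - 1)/(n - k - 1), is an assumption that happens to reproduce the reported 0.82. The function names are hypothetical.

```python
import math


def f_for_multiple_r(r: float, n: int, k: int) -> float:
    """F ratio for a multiple correlation r with k predictors and n cases."""
    r2 = r * r
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def shrunken_r(r: float, n: int, k: int) -> float:
    """Assumed small-sample correction: 1 - (1 - R^2)(n - 1)/(n - k - 1)."""
    r2_corrected = 1.0 - (1.0 - r * r) * (n - 1) / (n - k - 1)
    return math.sqrt(max(r2_corrected, 0.0))


n_tasks, n_predictors = 26, 4
f_uncorrected = f_for_multiple_r(0.85, n_tasks, n_predictors)  # df = (4, 21)
corrected_r = shrunken_r(0.85, n_tasks, n_predictors)
f_corrected = f_for_multiple_r(0.82, n_tasks, n_predictors)    # reported cR

print(f"F for R = 0.85:  {f_uncorrected:.2f}")
print(f"corrected R:     {corrected_r:.3f}")
print(f"F for cR = 0.82: {f_corrected:.2f}")
```

With the rounded R = 0.85 the first F comes out slightly below the reported 13.75, presumably because the original computation carried more decimal places; the F for the corrected R agrees with the reported 10.78 to two decimals.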


Table 6

BASIC DATA FOR THE FIRST REGRESSION ANALYSIS

                                      Avg. No. Units     Average Rating on Six Scales*
     Tasks                            Produced Per
                                      Second              1            2   3   4   5   6

 1.  Two-Plate Tapping                    3.98            [illegible]  7   1   1   1   4
 2.  Key Tapping                          6.24            [illegible]  7   1   1   1   7
 3.  Ten-Target Aiming                    2.02            4            6   1   1   1   4
 4.  Rotary Aiming                        2.49            4            7   1   1   2   4
 5.  Hand-Precision Aiming                1.87            4            7   1   1   1   4
 6.  Visual Reaction Time                 2.71            4            1   1   1   1   7
 7.  Auditory Reaction Time               2.86            4            1   1   1   1   7
 8.  Minnesota - Placing                  1.23            4            7   1   1   1   4
 9.  Minnesota - Turning                  1.49            4            7   1   5   3   4
10.  Purdue Pegboard - Right Hand         0.56            4            7   1   1   1   4
11.  Purdue Pegboard - Both Hands         0.87            4            7   1   4   1   4
12.  Purdue Pegboard - Assembly           0.62            4            7   1   3   4   5
13.  O'Connor Finger Dexterity            0.53            4            7   1   1   1   4
14.  Santa Ana Finger Dexterity           1.80            4            7   1   1   1   4
15.  Pin Stick                            1.26            4            7   1   1   1   4
16.  Dynamic Balance                      0.04            4            7   2   4   1   2
17.  Medium Tapping                       1.34            4            7   1   1   1   4
18.  Large Tapping                        1.26            4            7   1   1   1   4
19.  Aiming                               1.31            4            7   1   1   1   4
20.  Pursuit Aiming I                     2.32            4            7   1   1   1   4
21.  Pursuit Aiming II                    1.76            4            7   1   1   1   4
22.  Square Marking                       1.16            4            7   1   1   1   4
23.  Tracing                              1.89            4            7   1   1   1   3
24.  Discrimination Reaction
     Time - Printed                       0.38            4            7   1   1   1   3
25.  Marking Accuracy                     1.37            4            7   1   1   1   4
26.  Verbal Addition Task                 0.19            4            5   1   1   1   7

* The six scales were: 1. Stimulus duration; 2. Number of output units;
3. Duration for which an output unit is maintained; 4. Simultaneity of
responses; 5. Number of procedural steps; and 6. Variability of stimulus
location.


Table 7

INTERCORRELATION MATRIX FOR THE
FIRST REGRESSION ANALYSIS

         1      2      3      4      5      6      7*

 1     1.00    .01   -.06   -.12   -.10    .27    .78
 2            1.00    .07    .15    .12   -.70   -.19
 3                   1.00    .45   -.07   -.38   -.26
 4                          1.00    .55   -.23   -.28
 5                                 1.00    .04   -.12
 6                                        1.00    .47
 7*                                              1.00

* Criterion measure
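As a rough consistency check, the multiple correlation for the first four predictors can be recomputed from the Table 7 entries alone, using the standard identity R² = Σ βᵢ·rᵢ, where the standardized weights β solve the predictor intercorrelation system. The sketch below uses only the standard library; the helper name `solve` is hypothetical, and this is not the Wherry-Doolittle procedure itself, only the end result it converges to for a fixed set of predictors.

```python
import math

# Intercorrelations among predictors 1-4, taken from Table 7.
R_XX = [
    [1.00, 0.01, -0.06, -0.12],
    [0.01, 1.00, 0.07, 0.15],
    [-0.06, 0.07, 1.00, 0.45],
    [-0.12, 0.15, 0.45, 1.00],
]
# Correlations of predictors 1-4 with the criterion (variable 7).
R_XY = [0.78, -0.19, -0.26, -0.28]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


betas = solve(R_XX, R_XY)  # standardized (beta) weights
r_squared = sum(b * r for b, r in zip(betas, R_XY))
multiple_r = math.sqrt(r_squared)
print("beta weights:", [round(b, 3) for b in betas])
print("R squared:", round(r_squared, 3), " R:", round(multiple_r, 3))
```

Because the table entries are rounded to two decimals, the result should land near, but not exactly on, the reported multiple R.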


An index of forecasting efficiency (Guilford, 1956, p. 398) was
computed. This index indicates the degree to which predictions made by
means of the regression equation are better (more accurate) than those
made merely from a knowledge of the mean of the criterion measures. The
index for the corrected R was 42.6%, which indicated that use of the
regression equation would be superior to using the mean alone.
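The report does not print the formula for this index. A minimal sketch, assuming the usual form of Guilford's index, E = 100(1 - sqrt(1 - r²)), is given below; the function name is hypothetical.

```python
import math


def forecasting_efficiency(r: float) -> float:
    """Percent reduction in error of estimate relative to predicting the mean."""
    return 100.0 * (1.0 - math.sqrt(1.0 - r * r))


# Corrected multiple correlations reported for the two post-diction studies.
for label, corrected_r in (("Study 1", 0.82), ("Study 2", 0.73)):
    print(f"{label}: {forecasting_efficiency(corrected_r):.1f}%")
```

Applied to the rounded corrected R of 0.82, this form gives a value within a fraction of a percentage point of the reported 42.6%; the small gap is consistent with the original calculation having used an unrounded R.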

The regression equation was:

    P'_M = -1.064 + 1.245X_1 - 0.197X_2 - 1.072X_3 - 0.089X_4

where

    P'_M = Predicted mean number of output units produced per second;
           and

    X_1 to X_4 = Task characteristic scales #1 through #4 listed above.
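To show how the equation is used, the sketch below applies it to four tasks whose ratings on the first four scales are legible in Table 6 and prints the predicted value beside the observed one. The coefficients and ratings are taken from the report; the function and variable names are hypothetical.

```python
INTERCEPT = -1.064
# Weights for: stimulus duration, number of output units,
# duration an output unit is maintained, simultaneity of responses.
WEIGHTS = (1.245, -0.197, -1.072, -0.089)


def predict_units_per_second(ratings):
    """Apply the first post-diction regression equation to four scale ratings."""
    if len(ratings) != len(WEIGHTS):
        raise ValueError("expected ratings on exactly four scales")
    return INTERCEPT + sum(w * x for w, x in zip(WEIGHTS, ratings))


# (ratings on scales 1-4, observed units per second), from Table 6.
TASKS = {
    "Visual Reaction Time": ((4, 1, 1, 1), 2.71),
    "Minnesota - Placing": ((4, 7, 1, 1), 1.23),
    "Purdue Pegboard - Both Hands": ((4, 7, 1, 4), 0.87),
    "Dynamic Balance": ((4, 7, 2, 4), 0.04),
}

for name, (ratings, observed) in TASKS.items():
    predicted = predict_units_per_second(ratings)
    print(f"{name:30s} predicted {predicted:6.3f}   observed {observed:5.2f}")
```

Since these tasks were part of the sample used to fit the equation, close agreement here is post-diction, not evidence of cross-validated accuracy.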



Second  Post-Diction  Study 


The  second  post-diction  study  was  based  on  data  from  the  third 
reliability  study  described  earlier  in  which  two  judges  rated  21  tasks 
on  18  scales.  The  criterion  measure  common  to  the  20  tasks  ultimately 
used  was  the  mean  percent  time  on  target  achieved  after  five  minutes  of 
practice  on  the  tasks  in  question.  These  tasks  and  their  associated 
performance  data  were  obtained  from  studies  reported  in  the  experimental 
literature.  (See  Appendix  6  for  references  to  these  studies.) 

The need to reduce the set of predictors existed here as in the
first  post-diction  study.  Accordingly,  the  same  reductive  procedure  was 
followed.  This  involved  ranking  the  18  scales  (Appendix  5)  in  terms 
of  their  reliability  and  then  selecting  the  final  subset  on  the  basis 
of  high  reliability.  This  operation  resulted  in  the  selection  of  the 
following  scales: 

1.  Number  of  procedural  steps, 

2.  Precision  of  responses, 

3.  Number  of  responses, 

4.  Number  of  output  units, 

5.  Simultaneity  of  responses,  and 

6.  Number  of  elements/output  unit. 

Table 8 presents the data on which the second post-diction study was
based.

A multiple correlation (R) was computed using a stepwise procedure.
The order of the scales in the above list paralleled the order in which
the predictor variables emerged from the regression analysis. A point
of diminishing returns, in terms of percent variance accounted for (R²),
was reached after the fourth predictor emerged. Consequently, a
regression equation was written using only the first four scales listed
above. The half-diagonal intercorrelation matrix for all seven variables
(six predictors, one criterion) is presented in Table 9.


Table 8

BASIC DATA FOR THE SECOND REGRESSION ANALYSIS

[Table body (the 20 tasks with their criterion scores and average
ratings on the six scales*) is not recoverable from the scanned page.]

* The six scales were: 1. Number of procedural steps; 2. Precision of
responses; 3. Number of responses; 4. Number of output units;
5. Simultaneity of responses; and 6. Number of elements/output unit.


Table 9

INTERCORRELATION MATRIX FOR THE SECOND REGRESSION ANALYSIS

         1      2      3      4      5      6      7*

 1     1.00   -.34    .90    .44    .75    .76   -.54
 2            1.00    .25    .34    .61    .26    .30
 3                   1.00    .59    .60    .60   -.41
 4                          1.00    .38    .22    .07
 5                                 1.00    .81   -.18
 6                                        1.00   -.46
 7*                                              1.00

* Criterion measure


The multiple R achieved for this post-diction study was 0.79, which
accounted for 63% of the variance (R²). This coefficient was significant
[F (4, 15) = 6.42, p < .01]. Correction for small sample bias yielded a
cR = 0.73, which was also significant [F (4, 15) = 4.28, p < .05]. The
index of forecasting efficiency for this corrected R was 31.7%. This
figure indicated that prediction using the regression equation would be
superior to that made on the basis of knowledge of the mean of the
criterion measures alone.

The regression equation was:

    P'_M = -1.484 - 19.056X_1 + 12.102X_2 [sign illegible] 4.213X_3 + 1.2[?]1X_4

where

    P'_M = Predicted mean percent time on target after 5 minutes of
           practice; and

    X_1 to X_4 = Task characteristic scales #1 through #4 listed above.




Discussion 

The results of both post-diction studies are presented for comparison
in Table 10.


Table 10

COMPARISON OF POST-DICTION STUDIES 1 AND 2

              Uncorrected       Corrected       Forecasting
               R      R²         R      R²      Efficiency     p (cR)

Study 1       .85    .72        .82    .67         43%          .01
Study 2       .79    .63        .73    .53         32%          .05

It  is  apparent  that  the  post-diction  efforts  were  successful  in 
both  cases.  The  critical  question  of  whether  these  results  would  hold 
up  in  the  face  of  cross-validation  remains  an  open  issue.  Both  studies 
provide  a  predictive  mechanism  which  had  adequate  merit  when  compared 
to  predicting  performance  on  the  basis  of  knowledge  of  only  the  means 
of  the  respective  samples. 

Consideration of these results was interesting in light of the
model of performance cited earlier in the report. There, performance
was viewed as a function of the operator, the task, and the environment.
Given that the operator and the environment components were
essentially "uncontrolled" or, at least, were unknown quantities in the
studies used here, it was not anticipated that the task component alone
would account for as much of the variance (67% and 53%) as it seemingly
did.


The model, for instance, suggested that uncontrolled variations in
the operator and environmental components might well mask the
relationship between task characteristics and performance. This masking
may indeed have been present. That it was not as pronounced as expected,
however, may have been due to the fact that the operator and
environmental components were being indirectly controlled or almost held
constant. For example, with regard to the environment, it could be assumed
that any experimenter would attempt to ensure that such conditions as
room temperature, noise level, level of illumination, etc., were at least
within some "subjective zone of acceptance" when setting up his
experiment unless these variables were actually part of his design. Since
the studies chosen were picked so as to avoid the presence of such
independent variables as stress, drugs, etc., it is reasonably safe to
assume that the "environment component" was essentially constant across
studies. Furthermore, the use of mean performance scores on each task
(obtained by averaging across individuals) tended to minimize the
influence of individual difference variables.

Given  the  limitations  inherent  in  the  post-diction  approach,  these 
studies  nevertheless  showed  that  selected  task  characteristics  were 
correlates  of  performance.  Use  of  the  task  definition  described  earlier 
and  of  the  descriptive  indices  derived  from  it  appeared  to  provide  a 
basis  for  systematically  relating  differences  among  tasks  to  variations 
in  performance. 




CONCLUSIONS AND RECOMMENDATIONS


The work described in this report has focused on but one of several
possible approaches which might be pursued in better organizing
information about human performance. In the present approach "tasks" were
viewed as more than merely convenient vehicles to be used when assessing
the effects of selected experimental treatments on performance. Instead,
tasks were treated as complexes of independent variables which, in their
own right, were capable of influencing performance. To better understand
their influence, therefore, a language was developed to permit objective
and direct description of different tasks and to provide a basis for
comparing and contrasting various "task treatments".

This effort has tentatively demonstrated that it is possible to
describe tasks in terms of a task-characteristics language which is
relatively free of the subjective and indirect descriptors found in many
other systems. It has further demonstrated that the task characteristics
may represent important correlates of performance. Although more
convincing proof of this point must await cross-validation exercises, it
was possible to describe subtle differences among tasks and to relate
such differences systematically to variations in performance.

While successful in many respects, the study also encountered a
number of difficulties. First, although several scales proved reasonably
reliable, many others did not. Substantial improvement in this area is
required and might result from more intensive training of judges, better
definition of characteristics, and/or improved methods of quantification.
Until higher overall reliabilities can be obtained, continued use of
panels of judges will be necessary. This procedure is less attractive
than the use of a single rater.

Second, the current language was designed so as to be applicable
to all tasks, given our definition of a "task". The study indicated,
however, that the scales in their present form were less suitable for
the description of "cognitive" paper-and-pencil tasks. It may be
necessary to develop additional descriptors for this type of task, or
to treat such tasks separately with an entirely different descriptive
system.

Third and finally, one use of the descriptive system was in
predicting the mean level of performance expected on different tasks. It
became apparent, however, that meaningful regression equations could be
developed only when the tasks in question shared the same response
measure. In other words, different equations would be required for tasks
on which different response measures were employed.

The general consequence of this situation, therefore, is the need
for research which attempts to identify the smallest set of distinct
response measures which can be used to represent all possible measures.
Teichner and Olson (1969) have suggested four such measures (probability
of detection, percent error, percentage decrement in time on target,
and reaction time) which, if they encompassed a large proportion of all
possible tasks, would be worth pursuing. A second consequence bears
directly on the language developed in the present study. The possibility
exists of tailoring separate descriptive systems for use with different
categories of tasks (defined in terms of response measures). While this
approach is certainly feasible, and might actually be superior were one
only interested in a particular category of tasks, it was not adopted
in the present study because of more catholic interests. A language
was desired which not only would provide a basis for predicting
performance (within categories) but which would also provide for
comparisons of tasks across different categories.

Much  additional  research  is  required  if  the  approach  is  to  be 
developed  to  the  fullest  extent  possible.  Two  efforts  in  particular  are 
required.  The  first  would  center  on  the  type  of  application  emphasized 
in the present effort, while the second would attempt to broaden the
scope  of  the  approach. 




First, the predictive methodology should be assessed using a much
larger sample of tasks. Resultant regression equations must then be
evaluated in formal cross-validation exercises. Given that these efforts
were successful, how would the predictive methodology be applied? Ideally,
the user (an equipment design engineer, a training specialist, etc.)
would first identify the type of performance measure most appropriate
for the new task in question. He would then refer to a document containing
a number of regression equations, each of which was specific to a
particular type of performance measure. Here he would see which scales
were involved in assessing the type of performance relevant to his
interest. He would then rate the new task on these scales and enter these
values into the equation which would contain the appropriate weights. The
output would be a predicted mean level of performance on that task at
some specified point in the learning curve. This estimated level of
performance could then be compared to some desired criterion level of
performance. If the predicted performance were inadequate relative to
the desired level, the user would receive guidance regarding remedial
actions, i.e., redesigning certain aspects of the task. For example,
beta-weights for each of the terms in the equation would indicate the
relative contribution made by each task characteristic to the predicted
performance. The user would be in a position to change certain features
of the task by assessing which features were most potent versus those
which were amenable to change. Having made these changes conceptually,
he could rerun the regression equation using the new task values and
see if a sufficient improvement in performance had occurred. This
iterative process could be accomplished without physically changing the
equipment until a change was warranted.
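The iterative redesign loop described above might look like the following sketch. It borrows the first post-diction equation purely as a stand-in for "a regression equation specific to the performance measure"; the target level, the candidate redesigns, and all function and key names are hypothetical illustrations, not part of the report.

```python
INTERCEPT = -1.064
# Stand-in weights: the first post-diction equation from this report.
WEIGHTS = {
    "stimulus_duration": 1.245,
    "output_units": -0.197,
    "output_duration": -1.072,
    "simultaneity": -0.089,
}


def predict(ratings):
    """Predicted mean performance for a task described by its scale ratings."""
    return INTERCEPT + sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


def first_adequate_redesign(ratings, candidates, target):
    """Try conceptual changes in order; return the first that meets the target."""
    for change in candidates:
        trial = {**ratings, **change}  # apply the change on paper only
        if predict(trial) >= target:
            return change, predict(trial)
    return None, predict(ratings)


# Hypothetical new task and a hypothetical criterion level of performance.
new_task = {"stimulus_duration": 4, "output_units": 7,
            "output_duration": 2, "simultaneity": 4}
target_level = 1.0

# Features judged amenable to change, listed in the order the user would try them.
candidate_changes = [{"simultaneity": 1}, {"output_duration": 1}]

chosen, predicted = first_adequate_redesign(new_task, candidate_changes, target_level)
print("baseline prediction:", round(predict(new_task), 3))
print("chosen redesign:", chosen, "->", round(predicted, 3))
```

Ordering the candidates by how easy each feature is to change, and reading the weights to see which change is likely to matter most, mirrors the "most potent versus amenable to change" trade-off described in the text.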

A second, rather different application of the descriptive language
should also be studied. The situation here would be represented by
the case where a review of the literature was conducted to determine
the general effect of a specific "environmental" variable, e.g., massed
versus distributed practice, levels of noise, etc., on performance.
Typically, the findings of such a survey could be used to define a subset
of studies in which, for example, massed practice proved superior; a
subset in which spaced practice proved superior; and, possibly, a third
subset which yielded no difference between the variables. Having
categorized the studies in terms of which treatment was superior (i.e.,
massed, distributed, neither), the tasks used in those studies would then
be rated, and discriminant function analyses would be conducted to
determine whether different task profiles were associated with the
various criterion groups. If such were the case, additional studies
would be selected, tasks within those studies would be rated, and the
obtained profiles would be analyzed in order to predict which distribution
of practice should be superior for a given task. The predictions would
be checked against actual learning data. If successful, these efforts
would have identified those aspects of tasks which were conducive to
the application of, in the case of the example cited, either
massed or distributed practice. These findings would be of importance
to researchers in both the applied and theoretical fields. Such
suggestions were made earlier by Fleishman (1967) and Fleishman, Teichner,
and Stephenson (1970); beginnings in this direction have been made
under this project by Teichner and Whitehead (1971).

In summary, in addition to pursuing the two major applications cited
above, the following activities should also be considered:

(a)  the  development  of  descriptive  systems  for  the  operator  and 
environmental  components; 

(b)  the  development  of  a  response  taxonomy  or  classification  system 
to  reduce  the  number  of  potential  performance  measures  to  a  manageable 
set; 

(c)  a  mathematical  procedure  for  allowing  the  characteristics  of 
all  three  of  the  model's  components  to  enter  into  a  full  test  of  the 
model's  predictive  efficiency; 

(d)  the  further  development  of  the  task  characteristics  themselves 
in  the  direction  of  greater  quantification; 


(e) a more adequate means of assessing the types of "reliability"
of interest to the rating situation encountered here;

(f) the development of a collection of suitable tasks both adequate
in number and type to permit cross-validation; and

(g) programmatic experimental efforts in which tasks, operators,
and the environment can be systematically varied.

The need for further development notwithstanding, the present study
has served a valuable purpose. It has demonstrated the essential
validity and utility of a rather different method of task description.
The characteristics themselves are not the only ones, nor necessarily
the best ones, which might be developed. Similarly, only one of several
possible uses of the descriptive data was evaluated. Although the
specifics of the system may eventually assume a very different form, the
present study has demonstrated the soundness of the underlying approach.




REFERENCES 


Cattell, R. B. & Coulter, M. A. Principles of behavioral taxonomy and
the mathematical basis of the taxonomy computer program. British
Journal of Mathematical and Statistical Psychology, 1966, 19
(Part 2), 257-269.

Cotterman, T. E. Task classification: An approach to partially ordering
information on human learning. Report No. WADC TN 58-574, January
1955. Wright Patterson Air Force Base, Ohio.

Farina, A. J., Jr. Development of a taxonomy of human performance:
Descriptive schemes for human task behavior. Report No. AIR-
726-1/69-TR-2, January 1969. American Institutes for Research,
Washington, D. C.

Fine, S. A. A functional approach to a broad scale map of work behaviors.
Report No. HSR-RM-63/2, September 1963. Human Sciences Research,
McLean, Virginia.

Fitts, P. M. Factors in complex skill training. In R. Glaser (Ed.),
Training research and education. Pittsburgh: University of
Pittsburgh Press, 1962.

Fleishman, E. A. Performance assessment based on an empirically derived
task taxonomy. Human Factors, 1967, 9(4), 349-366.

Fleishman, E. A. The description and prediction of perceptual-motor
skill learning. In R. Glaser (Ed.), Training research and education.
Pittsburgh: University of Pittsburgh Press, 1962.

Fleishman, E. A. & Stephenson, R. W. Development of a taxonomy of human
performance: A review of the third year's progress. Report No.
AIR-726-9/70-TPR-3, September 1970. American Institutes for
Research, Washington, D. C.

Fleishman, E. A., Kinkade, R. G., & Chambers, A. N. Development of a
taxonomy of human performance: A review of the first year's
progress. Report No. AIR-726-11/68-TPR-1, November 1968. American
Institutes for Research, Washington, D. C.

Fleishman, E. A., Teichner, W. H., & Stephenson, R. W. Development of a
taxonomy of human performance: A review of the second year's
progress. Report No. AIR-726-1/70-TPR-2, January 1970. American
Institutes for Research, Washington, D. C.

Folley, J. D., Jr. Development of an improved method of task analysis
and beginning of a theory of training. Report No. NAVTRADEVCEN
1218-1, June 1964. U. S. Naval Training Device Center, Port
Washington, New York.




Gagne, R. M. Human functions in systems. In R. M. Gagne (Ed.),
Psychological principles in system development. New York: Holt,
Rinehart, and Winston, 1962.

Ginsburg, R., McCullers, J. C., Merryman, J. J., Thomson, C. W., &
Whitte, R. S. A review of efforts to organize information about
human learning, transfer, and retention. San Jose, California:
San Jose State College, 1966.

Guilford, J. P. Fundamental statistics in psychology and education.
New York: McGraw-Hill, 1956.

Hackman, J. R. Tasks and task performance in research on stress. In
J. E. McGrath (Ed.), Social and psychological factors in stress.
New York: Holt, Rinehart, and Winston, 1968.

McCormick, E. J. Job dimensions: Their nature and possible uses.
Paper presented at the International Congress of Applied Psychology,
Amsterdam, August 1968.

Melton, A. W. & Briggs, G. E. Engineering psychology. Annual Review
of Psychology, 1960, 11, 71-98.

Miller, R. B. Task taxonomy: Science or technology? International
Business Machines, Poughkeepsie, New York, 1966.

Miller, R. B. Task description and analysis. In R. M. Gagne (Ed.),
Psychological principles in system development. New York:
Holt, Rinehart, and Winston, 1962.

Reed, L. E. Advances in the use of computers for handling human factors
task data. Report No. AMRL-TR-67-16, April 1967. Wright
Patterson Air Force Base, Ohio.

Silverman, J. New techniques in task analysis. Report No. SRM 68-12,
1967. U. S. Naval Personnel Research Activity, San Diego, Calif.

Smith, P. C. & Kendall, L. M. Retranslation of expectations: An
approach to the construction of unambiguous anchors for rating scales.
Journal of Applied Psychology, 1963, 47(2), 149-155.

Stolurow, L. M. A taxonomy of learning task characteristics. Report
No. AMRL-TDR-64-2, 1964. Wright Patterson Air Force Base, Ohio.

Teichner, W. H. & Olson, D. Predicting human performance in space
environments. NASA Report No. CR-1370, 1969. National Aeronautics
and Space Administration, Washington, D. C.




Teichner, W. H. & Whitehead, J. Development of a taxonomy of human
performance: Evaluation of a task classification system for
generalizing research findings from a data base. Report No.
AIR-726/2035-4/71-TR-8, April 1971. American Institutes for
Research, Washington, D. C. (U. S. Army Behavior and Systems
Research Laboratory Research Study 71-8.)

Theologus, G. C., Romashko, T., & Fleishman, E. A. Development of a
taxonomy of human performance: A feasibility study of ability
dimensions for classifying human tasks. Report No. AIR-726-1/70-
TR-5, January 1970. American Institutes for Research, Washington,
D. C.

Wheaton, G. R. Development of a taxonomy of human performance: A review
of classificatory systems relating to tasks and performance.
Report No. AIR-726-12/68-TR-1, December 1968. American Institutes
for Research, Washington, D. C.

Winer, B. J. Statistical principles in experimental design. New York:
McGraw-Hill, 1962.

41 


APPENDICES 


Page 


Appendix 1.  Scales Used in the 3-Judge Study                    45

         2.  37 Tasks Used in the 3-Judge Study                  67

         3.  Scales Used in the 28-Judge Reliability Study       73

         4.  Tasks Used in the 28-Judge Reliability Study        91

         5.  Scales Used in the 2-Judge Study                    97

         6.  Tasks Used in the 2-Judge Study                    115




APPENDIX  1 


SCALES  USED  IN  THE  3-JUDGE  STUDY 


This  section  contains  the  19  scales  used  in  the  3-judge  study. 
Asterisks  identify  the  subset  of  these  scales  which  were  ultimately 
entered  into  the  multiple  regression  analysis. 




TASK  CHARACTERISTICS  ANSWER  SHEET 


Rater's Name ____________________

Date Rating Performed ____________________

Name and Number of Task Rated ____________________


Instructions

There are 19 rating scales. Each task should be rated on all 19
scales. As you assign a scale value to the task, write down the scale
value on the line for that rating scale as listed below. There is space
at the bottom for you to describe any problems you had in applying the
scales to the task.


 *1. Number of output units ____

 *2. Duration for which an output unit is maintained ____

  3. Number of elements per output unit ____

  4. Work load imposed by task goal ____

  5. Difficulty of goal attainment ____

  6. Precision of responses ____

  7. Rate of responding ____

  8. Amount of muscular effort involved in responses ____

 *9. Simultaneity of response ____

*10. Number of procedural steps ____

 11. Dependency of procedural steps ____

 12. Adherence to procedures ____

 13. Procedural complexity ____

*14. Variability of stimulus location ____

*15. Stimulus or stimulus-complex duration ____

 16. Regularity of stimulus occurrence ____

 17. Degree of operator control ____

 18. Reaction time/feedback lag ____

 19. Decision-making ____


Problems/  Comments 




*  1.  NUMBER  OF  OUTPUT  UNITS 


An  output  unit  is  specified  or  implied  in  the  statement,  of  the,  task 
goal.  Output  units  are  often:  an  assembly  of  objects,  a  stimulus -control 
relationship,  or  a  specifiable  end-product  (e.  g.  ,  arrival  at  B  in  the  task, 
run  from  A  to  B).  You  are  to  judge  the  number  of  output  units  specified 
or  implied  by  the  task  goal  relative  to  other  quotas  which  could  be 
established  for  the  same  type  of  task. 


Definition 


Examples 


As many as possible - as many
output units as possible are to be
produced, usually during a fixed
period of time.


• Insert as many plugs into the
connectors as possible in five
minutes.

• Do 200 push-ups in five minutes.
• Do 200 push-ups.


Moderate number - relative to
other possible quotas for the same
type of task, a moderate number
of output units is to be produced.


• Do twenty push-ups in five minutes.
• Do twenty push-ups.


7 


2 

One output unit - one output unit is
to be produced. It is either maintained
or signals the termination
of performance.


• Assume a push-up position.
Maintain it for five minutes.

• Do one push-up.

• Add the following list of integers.


*2.  DURATION  FOR  WHICH  AN  OUTPUT  UNIT  IS  MAINTAINED 


Once  the  operator  has  produced  an  output  unit,  he  may  be 
required  to  maintain  or  continue  it  for  one  of  several  time  periods. 
For  example,  it  can  be  maintained  for  as  long  as  possible;  or,  its 
completion  may  be  a  signal  to  leave  it  and  go  on  to  produce  the  next 
output  unit;  or,  finally,  having  produced  it,  performance  ends. 

Decide where the present output unit belongs on the scale below.


Definition 


Examples 


Maintenance for as long as
possible - an output unit (body
position, stimulus-control
relationship, etc.) is to be
maintained for as long as possible.


• Hang in a bent-arm position for
as long as possible.


6 


• Maintain a stimulus-control
relationship (target and cursor)
for 20 minutes.


5 


Moderate maintenance - relative
to other possible periods of
maintenance, an output unit is to be
maintained for a moderate period
of time.

4

• Maintain a stimulus-control
relationship for five minutes.


3 


2 


• Do as many push-ups as possible
in ten minutes holding each "down"
position for 30 seconds.


Production of
an output unit signals termination
of performance or production of
additional units. Maintenance,
therefore, is minimal.


• Do as many push-ups as possible
in two minutes.

• Solve the following trigonometric
problems.


3.  NUMBER  OF  ELEMENTS  PER  OUTPUT  UNIT 


One way of describing an output unit is in terms of the number
of elements involved in its production. By elements we mean the
objects or components which, when assembled, comprise the output
unit. In an addition problem, for example, the numbers to be added
are the elements which comprise the output unit.

Rate  the  present  task  in  terms  of  the  number  of  elements 
forming  an  output  unit  on  the  scale  below. 


Definition 

Many elements: each output unit
contains many constituent elements.

7


Examples


• Assemble a radio from the
components in this kit.


6 


Moderate  number  of  elements: 
each  output  unit  contains  several 
constituent  elements. 


• Change a flat tire.

•Rank  order  these  20  items. 


3  4 


2 


One  element:  each  output  unit 
contains  only  one  constituent 
element. 


• Push the button when the
light comes on.




4.  WORK  LOAD  IMPOSED  BY  TASK  GOAL 


Work load is judged in terms of the number of output units to
be produced relative to the amount of time allowed for their production,
i.e., output units per unit of time.

There  are  those  tasks  in  which  the  goal  is  to  maintain  a  situation, 
e.  g. ,  stay  within  40  feet  of  the  vehicle  ahead  of  you,  rather  than  pro¬ 
duce  multiple  output  units.  For  those  tasks,  the  degree  of  work  load 
is  directly  related  to  the  length  of  time  for  which  maintenance  is 
required. 

Rate  the  present  task  on  the  scale  below. 


Definition 


Examples 


High work load - as many output
units  as  possible  are  to  be  pro¬ 
duced  in  a  fixed  period  of  time; 
a  relatively  large  number  of  output 
units  is  to  be  produced  in  a 
relatively  short  period  of  time;  an 
output  unit  is  to  be  maintained  for 
a  relatively  long  time  or  for  as 
long  as  possible. 


• Drive as many nails as possible
in five minutes.

• Maintain a stimulus-control
relationship for one hour.

• Maintain a stimulus-control
relationship as long as possible.


5 


Moderate work load - a moderate
number  of  output  units  is  to  be 
produced  in  a  reasonable  period  of 
time;  an  output  unit  is  to  be  main¬ 
tained  for  a  moderate  period  of 
time  relative  to  other  possible 
periods. 


• Drive ten nails in five minutes.
• Maintain a stimulus-control
relationship for three minutes.


3 


2

• Drive these two nails in the next
five minutes.

• Sum the following five integers.

• Maintain a stimulus-control
relationship for 30 seconds.

Low work load - a small number
of output units is to be produced
in a relatively long period of time;
an output unit is to be maintained
for a relatively short period of
time.




5.  DIFFICULTY  OF  GOAL  ATTAINMENT 


Difficulty of goal attainment is a function of two things: 1) the
number of elements in an output unit, and 2) the degree of work load
(both these terms have been previously defined). The greater the
work load and the higher the number of elements, the more difficult
is the goal.


Definition 

High  difficulty  -  not  only  is  the 
work  load  high,  but  the  number 
of  elements  in  an  output  unit 
is  also  high. 


Examples


• You have two days in which you
are to assemble as many radio
kits as possible.


6 


5


Moderate difficulty - both the work
load and the number of elements in
an output unit are moderate; this
combination results in a task of
average difficulty. Or, one measure
is high and the other is low, thus
yielding a moderate average.

4

3


• Build two small shipping crates
today.


• Paint the walls of this room.
Take as much time as you need.


Low difficulty - relative to other
possible values, work load and
element number are both very
low.

1
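The averaging rule in the moderate-difficulty definition can be sketched in code. The sketch below assumes, purely for illustration, that work load and number of elements have already been rated from 1 to 7 and that difficulty is approximated by their rounded mean. The report only says difficulty is "a function of" the two ratings, so the function name `difficulty_estimate` and the use of a simple mean are assumptions, not the authors' procedure.

```python
def difficulty_estimate(work_load: int, elements: int) -> int:
    """Approximate difficulty as the rounded mean of two 1-7 ratings (assumed rule)."""
    for name, value in (("work_load", work_load), ("elements", elements)):
        if not 1 <= value <= 7:
            raise ValueError(f"{name} must be between 1 and 7, got {value}")
    # Integer arithmetic that rounds halves upward.
    return (work_load + elements + 1) // 2


# One measure high and the other low yields a moderate average.
print(difficulty_estimate(7, 1))
```

A rater would still make the final judgment; the sketch only shows why a high rating on one measure and a low rating on the other land near the middle of the scale.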




6.  PRECISION  OF  RESPONSES 


Tasks  may  be  differentiated  with  respect  to  the  degree  of  precision 
associated  with  overt  observable  responses.  Degree  of  precision  or 
motor  control  required  will  increase  as  target  size  decreases,  lag  in 
controls  increases,  rate  of  change  in  stimulus  increases,  etc.  You  are 
to  judge  the  degree  of  precision  required  in  overt  responses. 


Definition 

High degree of precision - because
of small targets, fine scales,
sensitive controls, etc., the subject
must make responses which are
extremely precise.

7

6 


5 


Moderate precision - relative to the
definitions  above  or  below,  a 
moderate  degree  of  precision  must 
accompany  subject's  responses. 

3 


Examples 


• Using a chemical balance (scales),
determine the weight of the
following objects to the nearest
microgram.

• Replace the mainspring in this
watch.


• Solder these two wires together.
• Using your pencil, trace this maze.


Low degree of precision - because
of large targets, gross scales,
insensitive controls, etc., the subject
can make responses which are
gross or imprecise.

1


• Do twenty push-ups.
• Sort the oranges and lemons into
two piles.




7. RATE OF RESPONDING


Goal-directed responses can be emitted at different rates. You
are to judge the rate of responding in a particular task by considering other
rates which are possible for that same task.


Definition 


Examples 


High rate of responding - many
responses are required per unit
time relative to other rates which
could be employed for the same
task. Responses are often repetitive
or serial. In the extreme, they are
continuous.

6


• Fire 20 rounds as quickly as
possible.

• Complete this jig-saw puzzle as
fast as you can.

• Track this target.


Moderate  rate  of  responding  -  a 
moderate  number  of  responses 
are  required  per  unit  time. 


• Fire 20 rounds. Fire rapidly
but also be as accurate as you can.
• You have half an hour to complete
this 20 item "True-False" quiz.


2


Low rate of responding - few
responses are emitted per unit
time. Responses are often singular.


• Add the following numbers. Take
all the time you need.


8.  AMOUNT  OF  MUSCULAR  EFFORT  INVOLVED  IN  RESPONSES 

in  "  «:lS  ^ired 

weight-lifting to^simfde  “ 


Definition 

High amount of muscular effort -
response(s) require a high degree
of muscular involvement.


Examples

• Do 20 push-ups.

• Lift the heaviest weight possible.


Moderate amount of muscular
effort required for the response(s)


Low  amount  of  muscular  effort 
required 


• Tighten nuts on bolts securely.


• Solder two wires together.
• Add numbers and report the
sum aloud.




*9.  SIMULTANEITY  OF  RESPONSES 


An  overt  response  or  sequence  of  responses  leading  to  the 
production  of  an  output  unit  may  involve  one  or  more  effectors 
(hands, arms, legs, feet, voice, etc.). These effectors may or
may  not  be  used  simultaneously. 

You are to rate the degree of simultaneity involved in using
the effectors needed in the response(s) leading to production of
an output unit.

Definition  Examples 


High  simultaneity  -  responses  in¬ 
volve  the  simultaneous  use  of 
several  effectors  on  a  fairly 
continuous  basis. 


7 


6 


• You are to fly this plane at 400
knots and an altitude of 5,000
feet, banking to the left and to
the right.

• Play this song on the piano.


Moderate simultaneity - responses
involve the simultaneous use of
at least two effectors on a
continuous or periodic basis.


5  4- 


• Pat your head and rub your stomach.
• Hit that target by firing your rifle.


3 


2 


Low simultaneity - responses involve
the use of only one effector
at a time. If other effectors are
employed, they are employed
sequentially.


• Push the button when the light
comes on.


*10.  NUMBER  OF  PROCEDURAL  STEPS 


Earlier we were concerned about the number of elements, i.e.,
objects or components, involved in the production of one output unit.
Now we want to consider the number of procedural steps (responses)
needed to produce one output unit. There is not necessarily a one-to-one
relationship between objects and responses.

Consider the number of responses or steps involved in
producing one output unit for the present task. Rate this task on the
scale below.

Definition  Examples 


Large  number  of  steps  -  the 
procedure  consists  of  a  large 
number  of  constituent  steps. 


• Build a crystal receiver set
following the enclosed instructions.


5 


Medium  number  of  steps  -  the 
procedure  contains  a  medium 
number  of  steps  relative  to  other 
procedures. 


• Solve the equation X² - 4X + 4 = 0.
• Type the following business letter.


3  -J 


2 


Small  number  of  steps  -  the 
procedure  consists  of  few  steps. 
At  a  minimum,  only  one  step  may 
be  necessary. 


•  Open  this  combination  lock 
(32L  -  43R  -  10L). 

• Press the button whenever the
light comes on.




11.  DEPENDENCY  OF  PROCEDURAL  STEPS 


Consider  again  the  number  of  steps  involved  in  producing  one 
output  unit.  The  steps  may  be  described  in  terms  of  the  dependency 
among  them;  dependency  concerns  the  extent  to  which  the  steps  must 
be  done  in  some  specified  order.  For  example,  dependency  exists 
between  steps  A  and  B  if  step  B  cannot  be  accomplished  without  step 
A  being  done  first.  Note:  Procedures  which  have  only  one  step  are 
automatically  low  in  dependency. 

Definition  Examples 


High dependency among steps -
each  step  in  the  procedure  is  com¬ 
pletely  dependent  upon  the  pre¬ 
ceding  procedural  step.  Systematic 
ordering  of  steps  is  at  a  maximum. 


• Using the combination you've been
given, open the safe.

•Dial  this  telephone  number. 


Moderate dependency among steps -
in  the  total  number  of  steps  com¬ 
prising  the  procedure,  approximately 
50%  are  dependent  upon  preceding 
steps. 


• Using colored blocks, stack
them  into  columns  four  blocks  high. 
Do  this  in  the  order  red  and  green 
for  the  first  two  blocks.  The 
remaining  blocks  may  be  of  any 
color. 


• Using colored blocks, stack
them into columns four blocks high.
Order  of  color  is  unimportant. 


Low dependency among steps -
procedural steps are not organized
in any particular sequence. Step "A"
may precede "B" or "B" may precede
"A". Procedures having one step are
low in dependency.
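The dependency definitions above turn on the share of steps that cannot be done until some other step is finished. A minimal sketch of that idea follows. It assumes a hypothetical representation in which each step is mapped to the set of steps that must precede it. The function name `dependency_fraction` and the choice to count only steps after the first are assumptions made for illustration.

```python
def dependency_fraction(prerequisites: dict[str, set[str]]) -> float:
    """Share of steps (after the first) that require some other step to be done first."""
    if not prerequisites:
        raise ValueError("at least one step is required")
    if len(prerequisites) == 1:
        # One-step procedures are low in dependency by definition.
        return 0.0
    dependent = sum(1 for needed in prerequisites.values() if needed)
    # The first step in any ordering cannot depend on an earlier one,
    # so at most len(prerequisites) - 1 steps can be dependent.
    return dependent / (len(prerequisites) - 1)


# Opening a combination lock: every setting depends on the one before it.
lock = {"first": set(), "second": {"first"}, "third": {"second"}}
print(dependency_fraction(lock))
```

A fraction near 1 corresponds to the high end of the scale, a fraction near one half to the moderate definition, and a fraction near 0 to the low end.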




12.  ADHERENCE  TO  PROCEDURES 


Tasks may vary in the extent to which the operator must
faithfully adhere to the procedures set forth. In some types
of tasks strict adherence is critical; in others, the operator may
depart somewhat from stated procedures without jeopardy to the
performance.

Judge  the  degree  of  adherence  to  stated  procedures  for  the 
present  task. 


Definition 


Examples 


7


High - strict adherence to procedures
with even small departures being
discouraged or having detrimental
results.


6


5


Moderate - some departures from
the stated procedures are tolerated.

4


3


2


Low - fairly large departures
from stated procedures are tolerated.


• Firing an M-1 rifle according to
procedures given by a Marine D.I.


• Given conventional procedures to
solve a trigonometric problem;
alternative procedures exist and
can be employed.


•Type  a  letter  using  whatever  pro¬ 
cedures  you  are  most  comfortable 
with. 




13.  PROCEDURAL  COMPLEXITY 


Procedural complexity is a function of the number of steps or
responses leading to an output unit and the degree of dependency
among these steps.

Rate  the  present  task  in  terms  of  its  procedural  complexity. 


Definition 


High  complexity  -  the  procedure 
contains  many  steps.  Each  step 
is  dependent  upon  execution  of  the 
step  which  precedes  it. 


Examples 

•Without  referencing  any  notes, 
perform  a  B-52  pre -flight  check- 
list  task. 


Moderate  complexity  -  the  pro¬ 
cedure  contains  several  steps. 
Not  all  steps  are  dependent  upon 
preceding  steps,  however. 


• Check and if necessary replace
the following ten tubes (T1, ..., T10)
in these 10 radio sets.


Low complexity - the procedure
consists of few steps and there is
little if any dependency among steps.

1


• When the light comes on, press
this button as fast as you can.

•  Bolt  this  bracket  to  that  frame. 




14.  VARIABILITY  OF  STIMULUS  LOCATION 


Judge  the  degree  to  which  the  physical  location  of  the  stimulus 
or  stimulus  complex  is  predictable  over  task  time. 


Definition 


Examples 


High  predictability  -  stimulus 
location  remains  basically 
unchanged. 


Medium  predictability  - 
location  changes  but  in  a 
known  manner  or  pattern. 


•  Stimulus  is  a  red  light  located 
on  a  display  panel. 


• Visually following an arrow in
flight  toward  a  target. 


Low  predictability  -  location 
changes  in  an  almost  random 
fashion. 


•  Predicting  which  leaf  will  fall 
from  a  tree  next. 


*15.  STIMULUS  OR  STIMULUS-COMPLEX  DURATION 


Consider  the  critical  stimulus  or  stimulus  complex  to  which 
the  operator  must  attend  in  performing  the  task.  Relative  to  the 
total  task  time,  for  how  long  a  duration  is  the  stimulus  or  stimulus 
complex  present  during  the  task? 


Definition 


Examples 


Long duration - stimulus would
remain indefinitely.

7


Medium duration - stimulus remains
present until changed (spatially,
temporally, etc.) by the response
made to it.

4


Short duration - stimulus ceases
prior to response being made to it.

1


6


5


3


2


• Drawing a picture by observing
a model of the object being drawn.


• Red light goes out when operator
pushes a button.


• Operator must identify words or
targets presented tachistoscopically.




16.  REGULARITY  OF  STIMULUS  OCCURRENCE 


Consider the critical stimulus or stimulus-complex to which the
operator must attend. Does it occur at regular (i.e., equal) intervals
or at irregular intervals? Treat equal intervals and constant presence
of the stimulus as equivalent conditions.

Rate  the  present  task  on  this  dimension. 


Definition 


Examples


High regularity - regular intervals,
periodic occurrence. Also refers
to stimulus which is constantly
present.


Medium  regularity  -  Irregular 
intervals  but  a  perceivable  pattern 
of  occurrence. 


Low regularity - very irregular
intervals; stimulus occurrence is
aperiodic.


6


5


3


2


• Responding to units on an
assembly line.


• Looking at a picture on a wall.


• Receiving Morse code.


• Detecting random signals on a
CRT display.




17.  DEGREE  OF  OPERATOR  CONTROL  OVER  THE  OCCURRENCE 
OF  THE  STIMULUS  AND  THE  RESPONSE 

Does  the  operator  determine  when  the  stimulus  appears  (e.g.  , 
self-controlled)  or  is  the  occurrence  of  the  stimulus  externally- 
controlled?  Given  the  occurrence  of  the  stimulus,  must  the  operator 
respond  immediately  (externally-controlled)  or  may  he  respond  at  will 
(self-controlled)  ? 

Based  on  these  two  decisions,  rate  the  task  in  question  on  the 
following  scale. 


Definition


Examples 


High subject control - (both
stimuli and responses are self-paced).

7


•  Reading  aloud  to  oneself. 


5 


Medium  subject  control  -  (either 
the  stimuli  or  the  responses  are 
self-paced). 


• Shooting skeet (shooter determines
when 'bird' appears)


Low  subject  control  -  (both 
stimuli  and  responses  are  ex¬ 
ternally  paced). 


1 


• Typical reaction time task.
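The two decisions this scale asks for (who paces the stimulus, and who paces the response) combine into the three defined levels. The sketch below is one way to express that mapping. The function name `operator_control_level` and the use of text labels rather than scale numbers are assumptions for illustration; a rater would still choose the final scale value.

```python
def operator_control_level(stimulus_self_paced: bool, response_self_paced: bool) -> str:
    """Map the two pacing decisions onto the three defined levels of subject control."""
    self_paced_count = int(stimulus_self_paced) + int(response_self_paced)
    # Both self-paced -> high; exactly one -> medium; neither -> low.
    return {2: "high", 1: "medium", 0: "low"}[self_paced_count]


# Skeet shooting: the shooter calls for the bird but must then respond at once.
print(operator_control_level(stimulus_self_paced=True, response_self_paced=False))
```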




18. REACTION TIME/FEEDBACK LAG RELATIONSHIP

What relationship exists between the operator's reaction time
interval (i.e., the time between stimulus appearance and initiation of the
operator's response) and the time lag interval occurring before feedback
(i.e., knowledge of the effects of the response) begins? Note carefully
that the two intervals of interest are formed by the initiation of the
stimulus, response, and feedback, e.g.,

|------- A -------|------- B -------|
| (Reaction Time) | (Feedback Lag)  |
|                 |                 |
Stimulus          Response          Feedback
Initiation        Initiation        Initiation

                  TIME


Definition 

A  >  B  -  Reaction  time  (A) 
exceeds  feedback  lag  (B) 


7 


Examples 

• Subject listens to a series of
digits and repeats them after
a 20-second delay.


A = B - Reaction time (A)
equal to feedback lag (B)


•  Subject  presses  button  to  turn 
off  red  light  when  it  comes  on. 
Light  goes  out  when  button  is 
pressed. 


2


A < B - Reaction time (A)
is shorter than feedback lag (B)


1


4


•Subject  answers  a  question  on 
a  paper-and-pencil  test;  gets 
results  at  end  of  test. 
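The three definitions on this scale amount to a comparison of two measured intervals. A minimal sketch of that comparison follows. It assumes both intervals are available in the same time unit. The function name `rt_feedback_relationship` and the optional `tolerance` for treating nearly equal intervals as equal are assumptions introduced for illustration.

```python
def rt_feedback_relationship(reaction_time: float, feedback_lag: float,
                             tolerance: float = 0.0) -> str:
    """Classify interval A (reaction time) against interval B (feedback lag)."""
    if reaction_time < 0 or feedback_lag < 0:
        raise ValueError("intervals must be non-negative")
    if abs(reaction_time - feedback_lag) <= tolerance:
        return "A = B"
    return "A > B" if reaction_time > feedback_lag else "A < B"


# Hypothetical values in seconds: a quick answer, with results delayed ten minutes.
print(rt_feedback_relationship(reaction_time=2.0, feedback_lag=600.0))
```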




19.  DECISION-MAKING 


The  task  instructions  guide  the  operator  in  producing  an  output 
unit.  Frequently,  the  steps  leading  to  the  output  unit  are  not  of  an 
"A  — B  — C"  nature,  but  instead  they  involve  choice-points  where  the 
operator  must  decide  which  of  several  potential  steps  should  be  done 
next.  He  bases  his  choice  on  the  outcome  of  the  last  step.  For 
example,  the  instructions  might  say,  "Press  button  A  and  observe 
the  outcome;  if  a  red  light  comes  on,  throw  the  switch.  If  the  blue 
light  comes  on,  throw  the  blue  switch.  "  The  key  feature  of  this 
situation  is  that  the  operator  must  decide  what  to  do  next  on  the  basis 
of  the  feedback  or  outcome  of  his  last  response. 

Rate  the  present  task  on  the  extent  to  which  it  contains  choice- 
points  in  the  steps  leading  to  an  output  unit. 


Definition 


Examples 


High decision-making - more
than 75% of the steps involved in
the production of an output unit
consist of choice-points.


•Trouble  shooting  a  piece  of 
electronic  gear 


6 


•  Diagnosing  an  illness 


5 


Moderate decision-making -
approximately half of the
steps involved in the production
of an output unit consist of
choice-points.


3


Low decision-making -
fewer than 25% of the steps
involved in the production of
an output unit consist of
choice-points.


• Reciting a short verse
from memory
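The percentage cut-offs in these definitions can be expressed as a small classification rule. The sketch below assumes the rater has counted the steps and the choice-points for one output unit. The function name `decision_making_level` is hypothetical, and treating everything between 25% and 75% as "moderate" is an assumption, since the definitions name only "approximately half".

```python
def decision_making_level(choice_points: int, total_steps: int) -> str:
    """Classify a task by the share of its steps that are choice-points."""
    if total_steps <= 0 or not 0 <= choice_points <= total_steps:
        raise ValueError("need 0 <= choice_points <= total_steps and total_steps > 0")
    fraction = choice_points / total_steps
    if fraction > 0.75:
        return "high"
    if fraction < 0.25:
        return "low"
    # Assumed: the definitions do not say how to treat the in-between range.
    return "moderate"


print(decision_making_level(choice_points=5, total_steps=10))
```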




APPENDIX  2 


37  TASKS  USED  IN  THE  3-JUDGE  STUDY 


These  tasks  were  drawn  primarily  from  a  study  by  Fleishman  (1954) . 
The  raters  were  provided  with  a  two-page  description  of  each  task 
which  contained  (a)  a  picture  of  the  apparatus;  (b)  a  verbal  des¬ 
cription  of  the  basic  task;  and  (c)  the  actual  instruction  read 
to  the  subject.  Two  examples  of  such  tasks  are  presented  in  their 
entirety  in  this  appendix,  along  with  a  listing  of  all  37  tasks  by 
name  and  source.  Double  asterisks  (**)  indicate  the  subset  of  26 
tasks  which  ultimately  entered  the  multiple  regression  analysis. 




TASK  1 


Apparatus 


Description 


The  S  is  seated  before  a  long  rectangular  boxlike  apparatus 
containing  two  openings.  Each  opening  is  the  entrance  to  a  straight 
passageway  which  S  must  negotiate  with  a  long  stylus.  He  moves 
the  stylus  forward  at  slightly  below  shoulder  height  and  at  arm's 
length.  He  must  move  the  stylus  slowly  and  steadily  away  from  his 
body,  trying  not  to  hit  the  sides  of  the  cylindrical  passage.  As  he 
reaches  the  end  of  the  passage  he  strikes  a  contact  point  and  with¬ 
draws  the  stylus,  again  trying  to  avoid  hitting  any  part  of  the  passage¬ 
way.  He  then  negotiates  the  second  passageway.  Two  complete  ne¬ 
gotiations  constitute  a  trial.  Counters  record  the  number  of  contacts 
and  clocks  record  the  amount  of  time  in  contact.  Six  trials,  no  time 
limit. 


Instructions 


Your task is to move this stylus slowly and carefully at arm's length
through the openings. You are to do this without touching the sides of
the  passageway  with  the  stylus.  When  the  stylus  makes  contact  with  the 
end  of  the  passageway,  withdraw  it  carefully  and  slowly  without  touching 
the  sides.  When  you  have  moved  the  stylus  in  and  out  of  opening  No.  1, 
move  to  opening  No.  2  and  repeat  the  procedure.  After  moving  in  and 
out  of  the  second  passageway,  place  the  stylus  beside  the  machine  and 
rest  until  told  to  continue.  You  will  repeat  the  procedure.  Are  there 
any  questions? 

Remember to keep the stylus at arm's length at all times and to
move as carefully as possible to reduce errors, which occur each time you
contact the sides of the passageway. Begin when I say 'Start'.




TASK  3 


Apparatus 


Description 

The  S  is  required  to  negotiate  an  irregular  slot  pattern  with  a  T- 
shaped  stylus.  He  sits  at  arm's  length  from  the  apparatus  box  and 
moves  slowly  and  steadily  through  the  pattern  from  right  to  left, 
depresses  a  plunger  at  the  end  of  the  pattern  with  his  stylus,  and  then 
returns  through  the  pattern.  This  constitutes  one  trial. 

Errors  are  recorded  each  time  any  part  of  the  stylus  touches 
the  top,  bottom,  or  back  of  the  slot.  Four  trials,  no  time  limit. 


Instructions 


Your task is to move the stylus at arm's length slowly and carefully
through this slot. You are to do this without allowing the stylus to touch
the top, bottom, or inside of the slot. Any time the stylus touches any
part of the metal plate around the slot, errors will be automatically
counted against you. The red light tells you when you are making errors.
When you get to the end of the slot, push in on the little plunger with
your stylus, and then retrace the pattern without removing the stylus
from the slot. When you have completed tracing back through the slot,
put your stylus down and place your hand in your lap. Rest until told to
begin.


Remember, it is important that you move slowly enough so that you
may avoid hitting any part of the slot.

Are  there  any  questions? 

Pick  up  the  stylus  and  begin  when  the  green  light  goes  on. 




TASK  LISTING* 


    1. Precision Steadiness
    2. Steadiness Aiming
    3. Tracking Tracing
**  4. Two-Plate Tapping
**  5. Ten-Target Aiming
**  6. Visual Reaction Time
**  7. Minnesota Rate of Manipulation-Turning
**  8. Purdue Pegboard-Right Hand
    9. Rotary Pursuit
   10. Complex Coordination
** 11. Key Tapping
** 12. Rotary Aiming
** 13. Hand-Precision Aiming
** 14. Auditory Reaction Time
** 15. Minnesota Rate of Manipulation-Placing
** 16. Purdue Pegboard-Two Hands
** 17. Purdue Pegboard-Assembly Test
** 18. O'Connor Finger Dexterity
** 19. Santa Ana Finger Dexterity
** 20. Pin Stick
** 21. Dynamic Balance
   22. Postural Discrimination
   23. Postural Discrimination
   24. Discrimination Reaction Time
   25. Rudder Control
** 26. Medium Tapping
** 27. Large Tapping
** 28. Pursuit Aiming I
** 29. Pursuit Aiming II
** 30. Aiming
** 31. Square Marking
** 32. Tracing
   33. Steadiness
** 34. Discrimination Reaction Time-Printed
** 35. Marking Accuracy
** 36. Verbal Addition Task¹
   37. Silent Reading Task²


*Tasks numbered 1 through 35 were abstracted from:

Fleishman, E. A. Dimensional analysis of psychomotor abilities.
Journal of Experimental Psychology, 48, 6, 1954, 437-454.

Certain of these tasks (7, 15; 8, 16, 17; and 22, 23) were used more than once
as there were different aspects of the tasks which could be scored. This
had the net effect of changing the nature and number of the output units and
certain of the other characteristics.

**Indicates the 26 tasks which ultimately entered the multiple regression
analysis.

¹This task was abstracted from:

Mech, E. V. Factors influencing routine performance under noise:
I. The influence of "set". Journal of Psychology, 1953, 35,
283-298.

²This task was abstracted from:

McGuigan, F. J., & Rodier, W. I. Effects of auditory stimulation on
covert oral behavior during silent reading. Journal of Experimental
Psychology, 1968, 76, 4, 649-655.




APPENDIX  3 

SCALES  USED  IN  THE  28-JUDGE  RELIABILITY  STUDY 


This section contains the 16 scales used in the 28-judge
reliability study.


Revision  2.  3/70 


TASK  CHARACTERISTICS  ANSWER  SHEET 


Rater's  Name  _ 

Date  Rating  Performed 
Task  Number 


Instructions 


There are 16 rating scales. Each task should be rated on all 16
scales. As you assign a scale value to the task, write down the scale
value on the line for that rating scale as listed below. There is space
at the bottom for you to describe any problems you had in applying the
scales to the task.


1. Number of output units ____

2. Duration for which an output unit is maintained ____

3. Number of elements per output unit ____

4. Work load ____

5. Precision of responses ____

6. Response rate ____

7. Degree of muscular effort involved ____

8. Simultaneity of responses ____

9. Number of procedural steps ____

10. Dependency of procedural steps ____

11. Variability of stimulus location ____

12. Stimulus or stimulus complex duration ____

13. Regularity of stimulus occurrence ____

14. Operator control of the stimulus ____

15. Operator control of the response ____

16. Rapidness of feedback ____


Problems  /Comments 
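The instruction that every task be rated on all 16 scales lends itself to a simple completeness check. The sketch below is illustrative only. The scale names follow the answer sheet, but the function name `check_answer_sheet` and the assumed range of 1 to 7 (inferred from the values marked on the scales) are assumptions, not part of the original instrument.

```python
SCALES = (
    "Number of output units",
    "Duration for which an output unit is maintained",
    "Number of elements per output unit",
    "Work load",
    "Precision of responses",
    "Response rate",
    "Degree of muscular effort involved",
    "Simultaneity of responses",
    "Number of procedural steps",
    "Dependency of procedural steps",
    "Variability of stimulus location",
    "Stimulus or stimulus complex duration",
    "Regularity of stimulus occurrence",
    "Operator control of the stimulus",
    "Operator control of the response",
    "Rapidness of feedback",
)


def check_answer_sheet(ratings: dict, low: int = 1, high: int = 7) -> list:
    """Return a list of problems: scales left unrated or rated outside low..high."""
    problems = []
    for name in SCALES:
        if name not in ratings:
            problems.append(f"missing: {name}")
        elif not low <= ratings[name] <= high:
            problems.append(f"out of range: {name}")
    return problems


# A sheet with every scale rated at the midpoint has no problems.
print(check_answer_sheet({name: 4 for name in SCALES}))
```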




1.  NUMBER  OF  OUTPUT  UNITS 


The entire purpose of the task is to create output units. An output unit
is the end product resulting from the task. Output units can take different
forms. For example, sometimes the output unit is a physical object
assembled from several parts. It may also take the form of a relationship
between two or more things, e.g., drive three car-lengths behind the car in
front of you. An output unit might also be a destination, e.g., run from here
to the corner, with the corner being the destination.

First,  identify  what  the  output  unit(s)  is  in  the  present  task.  Now,  judge 
the  number  of  such  output  units  that  someone  performing  this  task  is  supposed 
to  produce. 


Definition 

As many as possible - as many
output units as possible are to
be produced, usually during
a fixed period of time.

7

6 


Examples 


•  Insert  as  many  plugs  into  the 
connectors  as  possible  in  five 
minutes. 


5


Moderate number - a moderate
number of output units is to be
produced.


• Do twenty push-ups in five minutes.


3


One  output  unit  -  one  output  unit  is 
to  be  produced.  It  is  either  main¬ 
tained  or  it  signals  the  termination 
of  performance. 


•  Assume  a  push-up  position  and 
maintain  it  for  five  minutes. 

•  Do  one  push-up. 

• Add the following list of numbers.




2. DURATION FOR WHICH AN OUTPUT UNIT IS MAINTAINED


Once  the  operator  has  produced  an  output  unit  he  may  be  required  to 
maintain  or  continue  it  for  one  of  several  time  periods.  For  example,  it 
can  be  maintained  for  as  long  as  possible.  Another  alternative  is  that 
completing  one  output  unit  is  a  signal  to  leave  it  and  go  on  to  produce  the 
next  output  unit.  Or,  having  produced  the  output  unit,  performance  ends. 

Decide where the present output units belong on the scale below.


Definition 


Examples 


Maintenance  for  as  long  as 
possible  -  an  output  unit  (body 
position,  stimulus -control  re¬ 
lationship,  etc.  )  is  to  be  main¬ 
tained  for  as  long  as  possible. 


5 


Moderate  maintenance  -  relative 
to  other  possible  periods  of  * 

maintenance,  an  output  unit 
is  to  be  maintained  for  a 
moderate  period  of  time. 

3 


2 


Short  maintenance  -  production  of 
an  output  unit  signals  the  end 
of  performance  or  the  production 
of  additional  units.  Maintenance, 
therefore  i-  minimal  time. 


0  Hang  in  a  bent-arm  position  for 
as  long  as  possible. 


0  Maintain  a  stimulus -control  rela¬ 
tionship  for  20  minutes. 


0  Maintain  a  stimulus -control  rela¬ 
tionship  for  five  minutes. 


#Do  as  many  push-ups  as  possible  in 
ten  minutes  holding  each  "down"  posi¬ 
tion  for  30  seconds. 


•  Solve  the  following  trigonometric 
problems. 


76 


3.  NUMBER  OF  ELEMENTS  PER  OUTPUT  UNIT 


One  way  of  describing  an  output  unit  is  in  terms  of  the  number  of 
elements  involved  in  its  production.  By  elements  we  mean  the  parts  or 
components  which  comprise  the  output  unit.  In  an  addition  problem,  for 
example,  the  numbers  to  be  added  are  the  elements  which  comprise  the 
output  unit.  In  a  more  physical  task,  the  elements  could  be  parts  to  be 
assembled  or  apparatus  to  be  manipulated. 

Rate  the  present  task  on  the  scale  below  in  terms  of  the  number  of 
elements  entering  into  a  single  output  unit. 


Definition

7    Many elements: each output unit contains many elements.

6

5

4    Moderate number of elements: each output unit contains several
     elements.

3

2

1    One element: each output unit contains only one element.


Examples

•  Assemble a radio from the components in this kit.


•  Change a flat tire.

•  Rank order these 20 items.


•  Push the button when the light comes on.


4.  WORK  LOAD 


Work load refers to the number of output units to be produced relative
to the time allowed for their production. We are interested in the ratio of
the number of output units per unit time, e.g., make 5 widgets in 10 minutes =
1 widget produced every two minutes.

However, there are tasks in which the goal is to maintain a situation
rather than to produce multiple output units. An example is a driving
task in which you are to stay within 40 feet of the vehicle ahead of you. For
these  types  of  tasks,  work  load  refers  to  the  length  of  time  for  which  main¬ 
tenance  is  required.  The  longer  the  maintenance  period,  the  higher  the 
work  load. 

Therefore,  rating  a  task  in  terms  of  work  load  resolves  to  answering 
one  of  two  questions: 

1)  How  much  has  to  be  produced  in  what  amount  of  time;  or 

2)  How  long  does  this  situation  have  to  be  maintained  or  continued? 
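
The two questions above can be illustrated with a short sketch. It is not part of the original rating procedure: the function names and the convention of passing `None` for a maintenance task are assumptions made only for this example, which reuses the widget figures from the text.

```python
from typing import Optional


def production_rate(output_units: float, minutes: float) -> float:
    """Return output units per minute (the ratio described in the text)."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return output_units / minutes


def minutes_per_unit(output_units: float, minutes: float) -> float:
    """Return the time allowed per output unit (the reciprocal view)."""
    if output_units <= 0:
        raise ValueError("output_units must be positive")
    return minutes / output_units


def work_load_basis(output_units: Optional[float], minutes: float) -> str:
    """Report which of the two questions applies to a task.

    output_units is None for a maintenance task (a hypothetical
    convention used only in this sketch).
    """
    if output_units is None:
        return f"maintenance for {minutes:g} minutes"
    return f"{production_rate(output_units, minutes):g} units per minute"


# The widget example from the text: 5 widgets in 10 minutes.
print(minutes_per_unit(5, 10))   # 2.0 minutes per widget
print(work_load_basis(5, 10))    # 0.5 units per minute
print(work_load_basis(None, 3))  # maintenance for 3 minutes
```

The sketch only computes the quantity a rater would consider; placing that quantity on the seven-point scale remains a judgment relative to other tasks.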


Definition

7    High work load - as many output units as possible are to be
     produced in a fixed period of time; a relatively large number of
     output units is to be produced in a relatively short period of
     time; an output unit is to be maintained for a relatively long
     time or for as long as possible.

6

5

4    Moderate work load - a moderate number of output units is to be
     produced in a reasonable period of time; an output unit is to be
     maintained for a moderate period of time relative to other
     possible periods.

3

2

1    Low work load - a small number of output units is to be produced
     in a relatively long period of time; an output unit is to be
     maintained for a relatively short period of time.


Examples

•  Drive as many nails as possible in five minutes.

•  Maintain a stimulus-control relationship as long as possible.


•  Drive ten nails in five minutes.

•  Maintain a stimulus-control relationship for three minutes.


•  Drive these two nails in the next five minutes.

•  Sum the following five numbers.

•  Maintain a stimulus-control relationship for 30 seconds.


5.  PRECISION  OF  RESPONSES 


Tasks  may  differ  in  terms  of  how  precise  or  exact  the  operator's 
responses must be. Judge the degree of precision involved in the present
task.


Definition

7    High degree of precision - because of small targets, fine scales,
     sensitive controls, etc., the subject must make responses which
     are extremely precise.

6

5

4    Moderate precision - relative to the definitions above or below,
     a moderate degree of precision must accompany subject's
     responses.

3

2

1    Low degree of precision - because of large targets, gross scales,
     insensitive controls, etc., the subject can make responses which
     are gross or imprecise.


Examples

•  Using a chemical balance (scales) determine the weight of the
   following objects to the nearest microgram.

•  Replace the mainspring in this wrist-watch.


•  Solder these two wires together.

•  Using your pencil, trace this maze.


•  Do twenty push-ups.

•  Sort the oranges and lemons into two piles.


6.  RESPONSE  RATE 


Responses  can  be  made  at  different  rates.  That  is,  the  frequency  with 
which responses must be made can vary from task to task. For example,
you would have a higher rate of responding if you were playing a singles game
of tennis than if you were playing chess. The responses would come more
frequently  in  the  first  case  than  in  the  second.  You  are  to  judge  what  rate 
of  responding  is  called  for  in  the  task  being  judged. 


7.  DEGREE  OF  MUSCULAR  EFFORT  INVOLVED 


This  dimension  considers  the  amount  of  muscular  effort  required  to 
perform the task. Examine the task and identify the most physically
strenuous  part  of  it.  Rate  this  part  on  the  scale  below. 


Definition

7    High amount of muscular effort - response(s) require a high
     degree of muscular involvement.

6

5

4    Moderate amount of muscular effort required for the response(s).

3

2

1    Low amount of muscular effort required.


Examples

•  Do 40 push-ups.

•  Lift the heaviest weight possible.


•  Tighten nuts on bolts securely with a wrench.


•  Solder two wires together.

•  Add numbers and report the sum aloud.


8.  SIMULTANEITY  OF  RESPONSES 


The  responses  which  the  operator  makes  in  producing  an  output  may 
involve one or more effectors (e.g., hand, foot, arm, voice, etc.). Depending
upon the task, these effectors may or may not be used simultaneously.
For example, both hands (two effectors) are used simultaneously in playing
a piano.

You are to rate the degree of simultaneity involved in using the effectors
needed for the response(s).


Definition

7    High simultaneity - responses involve the simultaneous use of
     several effectors.

6

5

4    Moderate simultaneity - responses involve the simultaneous use
     of at least two effectors.

3

2

1    Low simultaneity - responses involve the use of only one effector
     at a time. If other effectors are employed, they are employed
     sequentially.


Examples

•  You are to fly this plane at 400 knots and an altitude of 5,000
   feet, banking to the left and to the right.

•  Play this song on the piano.


•  Pat your head and rub your stomach.

•  Hit that target by firing your rifle.


•  Push the button when the light comes on.


9.  NUMBER  OF  PROCEDURAL  STEPS 


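Earlier we were concerned about the number of elements, i.e., objects
or components, involved in the production of one output unit. Now we
want to consider the number of responses needed to produce one output
unit. There isn't a necessary one-to-one relationship between objects
and responses.

Consider the number of responses or steps involved in producing one
output unit for the present task. Rate this task on the scale below.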


Definition

7    Large number of steps - the procedure consists of a large number
     of constituent steps.

6

5

4    Medium number of steps - the procedure contains a medium number
     of steps relative to other procedures.

3

2

1    Small number of steps - the procedure consists of few steps. At
     a minimum, only one step may be necessary.


Examples

•  Build a color TV kit following the enclosed instructions.


•  Solve the equation X² - 4X + 4 = 0.

•  Type the following business letter.


•  Open this combination lock (32L-43R-10L).

•  Press the button whenever the light comes on.


10.  DEPENDENCY  OF  PROCEDURAL  STEPS 


Consider  again  the  number  of  steps  (responses)  involved  in  producing 
one output unit. The steps may be described in terms of the dependency
among  them;  dependency  concerns  the  extent  to  which  the  steps  must  be 
done  in  some  specified  order.  For  example,  dependency  exists  between 
steps  A  and  B  if  step  B  cannot  be  accomplished  without  step  A  being  done 
first.  Note:  Procedures  which  have  only  one  step  are  automatically  low 
in  dependency. 
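
One way to make the notion of dependency concrete is sketched below. This is an illustration only, not the authors' rating method: the function name, the mapping of each step to its prerequisite steps, and the choice to normalize by the number of steps minus one are all assumptions of this sketch. It also assumes the prerequisites contain no cycles.

```python
from typing import Dict, List


def dependency_fraction(prerequisites: Dict[str, List[str]]) -> float:
    """Fraction of steps that cannot be done until another step is done.

    `prerequisites` maps each step name to the steps that must precede
    it. A one-step procedure returns 0.0, matching the note in the text
    that such procedures are automatically low in dependency.
    """
    steps = list(prerequisites)
    if len(steps) <= 1:
        return 0.0
    dependent = sum(1 for step in steps if prerequisites[step])
    # At least one step must come first, so at most len(steps) - 1 steps
    # can depend on a predecessor; normalize by that maximum.
    return dependent / (len(steps) - 1)


# A fixed-order procedure: each setting depends on the one before it.
lock = {"32L": [], "43R": ["32L"], "10L": ["43R"]}
# Four blocks stacked with no required color order.
unordered_blocks = {"block 1": [], "block 2": [], "block 3": [], "block 4": []}
single_step = {"press button": []}

print(dependency_fraction(lock))              # 1.0
print(dependency_fraction(unordered_blocks))  # 0.0
print(dependency_fraction(single_step))       # 0.0
```

A value near 1.0 corresponds to the high end of the scale and a value near 0.0 to the low end; the scale itself still calls for a rater's judgment.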


Definition

7    High dependency among steps - each step in the procedure is
     completely dependent upon the preceding procedural step.
     Systematic ordering of steps is at a maximum.

6

5

4    Moderate dependency among steps - in the total number of steps
     comprising the procedure, approximately 50% are dependent upon
     preceding steps.

3

2

1    Low dependency among steps - procedural steps are not organized
     in any particular sequence. Step "A" may precede "B" or "B" may
     precede "A". Procedures having one step are low in dependency.


Examples

•  Using the combination you've been given, open the safe.

•  Dial this telephone number.


•  Using colored blocks, stack them into columns four blocks high.
   Do this in the order red and green for the first two blocks. The
   remaining blocks may be of any color.


•  Using colored blocks, stack them into columns four blocks high.
   Order of color is unimportant.


11.  VARIABILITY  OF  STIMULUS  LOCATION 

Judge  the  degree  to  which  the  physical  location  of  the  stimulus  or 
stimulus  complex  is  predictable  over  task  time. 


Definition 


Examples 


High  predictability  -  stimulus 
location  remains  basically 
unchanged. 


•  Stimulus  is  a  red  light  located 
on  a  display  panel. 


Medium  predictability  - 
location  changes  but  in  a 
known  manner  or  pattern. 


•  Visually following an arrow in
flight toward a target.


Low predictability - location
changes in an almost random
fashion.


•  Predicting which leaf will
fall from a tree next.


12. STIMULUS OR STIMULUS COMPLEX DURATION

Consider the critical stimulus or stimulus-complex to which the
operator must attend in performing the task. Relative to the total task
time, for how long a duration is the stimulus or stimulus-complex present
during the task?


Definition

7    Long duration - stimulus would remain indefinitely.

6

5

4    Medium duration - stimulus remains present until changed
     (spatially, temporally, etc.) by the response made to it.

3

2

1    Short duration - stimulus ceases prior to response being made
     to it.


Examples

•  Drawing a picture by observing a model of the object being drawn.


•  Red light goes out when operator pushes a button.


•  Operator must identify words or targets presented
   tachistoscopically.


13.  REGULARITY  OF  STIMULUS  OCCURRENCE 


Consider  the  critical  stimulus  or  stimulus  complex  to  which  the 
operator must attend. Does it occur at regular (i.e., equal) intervals
or at irregular intervals? Treat regular intervals and constant presence
of the stimulus as equivalent conditions.

Rate  the  present  task  on  this  dimension. 


Definition

7    High regularity - stimulus occurs at regular intervals or is
     constantly present.

6

5

4    Medium regularity - stimulus occurs at irregular (unequal)
     intervals but there is a pattern of occurrence.

3

2

1    Low regularity - stimulus occurs at very irregular (almost
     random) intervals.


Examples

•  Cars coming along an assembly line.

•  Looking at a photograph of an object.


•  Receiving Morse code.


•  Detecting random signals on a CRT display.


14.  OPERATOR  CONTROL  OF  THE  STIMULUS 

What  degree  of  control  does  the  operator  have  over  either  the  occurrence 
or  relevance  of  the  stimulus? 




15.  OPERATOR  CONTROL  OF  THE  RESPONSE 


Given  the  occurrence  of  the  stimulus,  what  degree  of  control  does  the 
operator have over when he must initiate response?


Definition

7    Full operator control - the operator is the sole determiner of
     when the response will be made.

6

5

4    Partial operator control - the response must be made within a
     reasonable time after the stimulus occurs but the operator
     determines when within the interval the response will take place.

3

2

1    No operator control - the operator must respond as soon as the
     stimulus occurs.


Examples

•  Playing a game of chess by yourself where you play both sides and
   there is no time limit for responding.


•  The traffic light turns red when you are 500 yards from it; you
   have options as to when you will hit the brake.


•  Typical reaction time task. When the light comes on, push this
   button as fast as you can.


16. RAPIDNESS OF FEEDBACK


For  present  purposes  the  term  FEEDBACK  refers  to  information 
which  an  operator  may  get  about  the  correctness  of  a  response.  In  this 
scale  we  are  interested  in  how  quickly  feedback  occurs  once  the  response 
is  made. 


Definition

7    Immediate feedback - operator knows whether the response was
     correct as soon as it was completed.

6

5

4    Delayed feedback - operator receives feedback regarding his
     responses after entire task is completed.

3

2

1    No feedback provided - operator never receives feedback.


Examples

•  Finding the correct switch to turn on a light.


•  Opening a combination lock having five numbers.


•  Student takes a mid-term exam but is not told what grade he got.


APPENDIX  4 

TASKS  USED  IN  THE  28-JUDGE  RELIABILITY  STUDY 

This  section  contains  the  15  tasks*  used  by  28  judges  in  an 
assessment  of  the  reliability  of  16  scales.  The  information 
provided  on  each  task  consisted  of:  (a)  a  picture  of  the  apparatus; 
(b)  a  verbal  description  of  the  basic  task;  and  (c)  the  actual 
instructions  read  to  the  subject.  Two  examples  of  these  tasks 
are  presented  in  their  entirety  in  this  section;  the  remainder  are 
listed  by  name  along  with  a  reference  to  the  study  from  which  they 
are  abstracted. 


*The original reliability study employed 20 tasks: 15 psychomotor
and  5  paper-and-pencil  (cognitive)  tasks.  The  scales  proved  entirely 
unreliable  for  the  latter  tasks  and,  hence,  these  five  descriptions 
are  omitted  from  this  section. 




TASK  LISTING* 


1.  Two-Plate  Striking 

2.  Ten  Target  Aiming 

3.  Purdue  Pegboard 

4.  Control  Sensitivity 

5.  Two-Hand  Coordination 

6.  Pursuit  Confusion 

7.  Bimanual  Matching 

8.  Visual  Reaction  Time  Test 

9.  Steadiness  Aiming 

10.  Single  Dimension  Pursuit 

11.  Complex  Coordination  Test 

12.  Tracking  Tracing 

13.  Rotary  Pursuit 

14.  Precision  Steadiness 

15.  Minnesota  Rate  of  Manipulation 


*Descriptions and illustrations of these tasks were abstracted from:

Parker, J. R., Jr., & Fleishman, E. A. Ability factors and component
performance measures as predictors of complex tracking behavior.
Psychological Monographs, 1960, 74, No. 503.


TASK  10 


Apparatus 




Description 

The  subject  makes  compensatory  adjustments  (in  and  out  movements) 
of  a  control  wheel  in  order  to  keep  a  horizontal  line  in  a  null  position  as  it 
deviates  from  center  in  irregular  fashion.  The  control  wheel  is  damped 
pneumatically,  introducing  a  lag  into  the  system.  Score  is  the  time  the 
horizontal  line  is  held  in  a  null  position  during  the  four  1-minute  trials. 


Instructions

In  this  test  your  job  is  to  keep  this  white  line  inside  the  circle  cen¬ 
tered  between  these  two  points.  When  the  test  starts,  the  line  will  start 
to  move  out  of  position.  Your  task  is  to  keep  the  line  centered  as  it  de¬ 
viates  from  the  center.  You  can  move  the  line  up  by  pulling  out  on  this 
wheel  and  you  can  move  it  down  by  pushing  in  on  the  wheel.  Rotating  the 
wheel  has  no  effect.  Your  score  will  be  the  total  time  you  are  able  to 
keep  the  white  line  centered. 

READY? 

BEGIN!




TASK  11 


Apparatus 


Description 

The S is required to make complex motor adjustments of stick and
pedal controls in response to successively presented patterns of visual
signals.

A  correct  response  (movement  of  stick  and  rudder  controls  to 
proper  positions)  is  not  accomplished  until  both  the  hands  and  feet  have 
completed  and  maintained  the  appropriate  adjustment.  A  new  pattern 
appears  as  each  correct  response  is  completed.  Score  is  the  number  of 
completed  matchings.  Four  2-minute  test  periods. 


Instructions 


Your task will be to line up a green light with each of the three
red lights. Moving the stick from side to side moves the top green
light. Moving the stick forward and backward moves the middle green
light; and moving the rudder bar moves the bottom green light. Move
the stick sideways to match the top green light with the top red light.
Get it directly underneath. If it is off to one side like this it will not
work. Then hold the stick in position to keep the top lights matched
while you move it forward or backward to match the middle lights.
Then hold the stick steady while you match the bottom lights with the
rudder bar.

When  you  have  matched  all  three  lights,  a  new  setting  of  red  lights 
will  appear.  Go  right  ahead  and  match  the  new  setting  of  red  lights 
without  bothering  to  come  back  to  neutral. 




If  you  move  any  of  the  controls  as  far  as  it  will  go  there  will  be 
no  green  light.  You  must  ease  back  a  bit  to  find  the  end  green  light. 

When the test starts, you may use either your right or left hand
on the stick, but use only one hand throughout the test. Keep your heels
off the floor. Match as many settings of the lights as you can until
they go out. If the red lights ever fail to come on, let me know immediately.

Your  score  will  be  the  number  of  matchings  you  can  make  in  the 
time  allowed.  Work  as  rapidly  as  you  can.  When  the  buzzer  sounds, 
the  test  period  begins.  When  all  the  lights  go  out  again,  the  test  will 
be  over. 




APPENDIX  5 


SCALES  USED  IN  THE  2-JUDGE  STUDY 


This  section  contains  the  18  scales  used  in  the  2-judge  study. 
Asterisks  identify  the  subset  of  these  scales  which  were  ultimately 
entered  into  the  multiple  regression  analysis. 


TASK  CHARACTERISTICS  ANSWER  SHEET 


Rater _ 

Study  No. _ _ Author: 

Date _ 

Type  Task _ 

*1.  Number  of  output  units _ 

2.  Duration _ 

*3.  Number  of  elements/output  unit _

4.  Work  load _ 

*5.  Precision  of  responses _ 

6.  Response  rate _ 

7.  Tutorial  dependency _ 

8.  Natural  dependency _ 

9.  Operator  control  over  response _ 

10.  Simultaneity  of  responses _ 

11.  Number  of  responses _ 

12.  Number  of  procedural  steps _ 

13.  Feedback _ 

14.  Degree  of  muscular  effort _ 

15.  Operator  control  over  stimulus _ 

16.  Regularity  of  stimulus  occurrence _

17.  Stimulus  duration _ 

18.  Variability  of  stimulus  location _ 




*1.  NUMBER  OF  OUTPUT  UNITS  (UNIT) 

The  entire  purpose  of  the  task  is  to  create  output  units.  An  output  unit 
is  the  end  product  resulting  from  the  task.  Output  units  can  take  different 
forms.  For  example,  sometimes  the  output  unit  is  a  physical  object  as¬ 
sembled  from  several  parts.  It  may  also  take  the  form  of  a  relationship 
between two or more things, e.g., drive three car-lengths behind the car in
front of you. An output unit might also be a destination, e.g., run from here
to  the  corner,  with  the  corner  being  the  destination. 

First,  identify  what  the  output  unit(s)  is  in  the  present  task.  Now,  count 
the  number  of  such  output  units  that  someone  performing  this  task  is  supposed 
to  produce.  Use  the  designation  AMAP  (As  many  as  possible)  where  no  actual 
limit  exists. 

2.  DURATION  FOR  WHICH  AN  OUTPUT  UNIT  IS  MAINTAINED  (DURA) 

Once  the  operator  has  produced  an  output  unit  he  may  be  required  to  maintain 
or  continue  it  for  one  of  several  time  periods.  For  example,  it  can  be  maintained 
for  as  long  as  possible.  Another  alternative  is  that  completing  one  output  unit  is 
a  signal  to  leave  it  and  go  on  to  produce  the  next  output  unit.  Or,  having  produced 
the  output  unit,  performance  ends. 

Choose  which  of  the  following  alternatives  applies  here: 

1)  Maintain  unit  as  long  as  possible. 

2)  Maintain unit as long as possible but continue to produce additional
units.

3)  Leave unit and go on to produce next unit.

4)  Production  of  unit  signals  end  of  task. 

3.  NUMBER  OF  ELEMENTS  PER  OUTPUT  UNIT  (ELEM) 

One  way  of  describing  an  output  unit  is  in  terms  of  the  number  of  elements 
involved  in  its  production.  By  elements  we  mean  the  parts  or  components  which 
comprise  the  output  unit.  In  an  addition  problem,  for  example,  the  numbers  to 
be  added  are  the  elements  which  comprise  the  output  unit.  In  a  more  physical 
task,  the  elements  could  be  parts  to  be  assembled  or  apparatus  to  be  manipulated. 

Count  the  number  of  different  displays  and  controls  which  are  manipulated 
in  producing  a  single  output  unit. 


4. WORK LOAD (LOAD)


Work load refers to the number of output units to be produced relative
to the time allowed for their production. We are interested in the ratio of
the number of output units per unit time, e.g., make 5 widgets in 10 minutes =
1 widget produced every two minutes.

However, there are tasks in which the goal is to maintain a situation
rather than to produce multiple output units. An example is a driving
task in which you are to stay within 40 feet of the vehicle ahead of you. For
these  types  of  tasks,  work  load  refers  to  the  length  of  time  for  which  main¬ 
tenance  is  required.  The  longer  the  maintenance  period,  the  higher  the 
work  load. 

Therefore,  rating  a  task  in  terms  of  work  load  resolves  to  answering 
one  of  two  questions: 

1)  How  much  has  to  be  produced  in  what  amount  of  time;  or 

2)  How  long  does  this  situation  have  to  be  maintained  or  continued? 


Definitions

7    High work load - as many output units as possible are to be
     produced in a fixed period of time; a relatively large number of
     output units is to be produced in a relatively short period of
     time; an output unit is to be maintained for a relatively long
     time or for as long as possible.

6

5

4    Moderate work load - a moderate number of output units is to be
     produced in a reasonable period of time; an output unit is to be
     maintained for a moderate period of time relative to other
     possible periods.

3

2

1    Low work load - a small number of output units is to be produced
     in a relatively long period of time; an output unit is to be
     maintained for a relatively short period of time.


Examples

•  Drive as many nails as possible in five minutes.

•  Maintain a stimulus-control relationship as long as possible.


•  Drive ten nails in five minutes.

•  Maintain a stimulus-control relationship for three minutes.


•  Drive these two nails in the next five minutes.

•  Sum the following five numbers.

•  Maintain a stimulus-control relationship for 30 seconds.


*5. PRECISION OF RESPONSES (PREC)


Tasks may differ in terms of how precise or exact the operator's
responses  must  be.  Judge  the  degree  of  precision  involved  in  the  present 
task  by  considering  the  most  precise  response  made  in  producing  an  output 
unit. 


Definitions

7    High degree of precision - because of small targets, fine scales,
     sensitive controls, etc., the subject must make responses which
     are extremely precise.

6

5

4    Moderate precision - relative to the definitions above or below,
     a moderate degree of precision must accompany subject's
     responses.

3

2

1    Low degree of precision - because of large targets, gross scales,
     insensitive controls, etc., the subject can make responses which
     are gross or imprecise.


Examples

•  Using a chemical balance (scales) determine the weight of the
   following objects to the nearest microgram.

•  Replace the mainspring in this wrist-watch.


•  Using your pencil, trace this maze.


•  Do twenty push-ups.

•  Sort the oranges and lemons into two piles.


6. RESPONSE RATE (RATE)


Responses  can  be  made  at  different  rates.  That  is,  the  frequency  with 
which  responses  must  be  made  can  vary  from  task  to  task.  For  example, 
you  would  have  a  higher  rate  of  responding  if  you  were  playing  a  singles  game 
of  tennis  than  if  you  were  playing  chess.  The  responses  would  come  more 
frequently  in  the  first  case  than  in  the  second.  You  are  to  judge  what  rate 
of  responding  is  called  for  in  producing  one  output  unit  in  the  task  being  judged. 


Definitions

7    High rate of responding - many responses are required per unit
     time. In the extreme case responses become continuous.

6

5

4    Moderate rate of responding - a moderate number of responses
     are required per unit time.

3

2

1    Low rate of responding - few responses are emitted per unit
     time. Responses are often singular.


Examples

•  Fire 20 rounds for effect as quickly as possible.

•  Complete this jig-saw puzzle as fast as you can.

•  Track this target.


•  Fire 20 rounds. Fire rapidly but also be as accurate as you can.


•  Add the following numbers. Take all the time you need.

7. TUTORIAL DEPENDENCY OF RESPONSES (TUDE)


Consider  again  the  number  of  steps  (responses)  involved  in  producing 
one  output  unit.  The  steps  may  be  described  in  terms  of  the  dependency 
among  them;  dependency  concerns  the  extent  to  which  the  steps  must  be 
done  in  some  specified  order.  For  example,  dependency  exists  between 
steps  A  and  B  if  step  B  cannot  be  accomplished  without  step  A  being  done 
first. Note: Procedures which have only one step are automatically low
in dependency. Tutorial dependency refers to a dependency imposed
as part of the training in an effort to standardize trainee operations.


Definitions

7    High dependency among steps - each step in the procedure is
     completely dependent upon the preceding procedural step.
     Systematic ordering of steps is at a maximum.

6

5

4    Moderate dependency among steps - in the total number of steps
     comprising the procedure, approximately 50% are dependent upon
     preceding steps.

3

2

1    Low dependency among steps - procedural steps are not organized
     in any particular sequence. Step "A" may precede "B" or "B" may
     precede "A". Procedures having one step are low in dependency.


Examples

•  Using the combination you've been given, open the safe.

•  Dial this telephone number.


•  Using colored blocks, stack them into columns four blocks high.
   Do this in the order red and green for the first two blocks. The
   remaining blocks may be of any color.


•  Using colored blocks, stack them into columns four blocks high.
   Order of color is unimportant.


8. NATURAL DEPENDENCY OF RESPONSES (NADE)


Consider  again  the  number  of  steps  (responses)  involved  in  producing 
one output unit. The steps may be described in terms of the dependency
among them; dependency concerns the extent to which the steps must be
done in some specified order. For example, dependency exists between
steps A and B if step B cannot be accomplished without step A being done
first. Note: Procedures which have only one step are automatically low
in  dependency.  Natural  dependency  refers  to  dependency  that  is  inherent 
in  the  operation  of  the  equipment. 


Definitions

7    High dependency among steps - each step in the procedure is
     completely dependent upon the preceding procedural step.
     Systematic ordering of steps is at a maximum.

6

5

4    Moderate dependency among steps - in the total number of steps
     comprising the procedure, approximately 50% are dependent upon
     preceding steps.

3

2

1    Low dependency among steps - procedural steps are not organized
     in any particular sequence. Step "A" may precede "B" or "B" may
     precede "A". Procedures having one step are low in dependency.


Examples

•  Using the combination you've been given, open the safe.

•  Dial this telephone number.


•  Using colored blocks, stack them into columns four blocks high.
   Do this in the order red and green for the first two blocks. The
   remaining blocks may be of any color.


•  Using colored blocks, stack them into columns four blocks high.
   Order of color is unimportant.


9.  OPERATOR  CONTROL  OF  THE  RESPONSE  (OCOR) 


Given the occurrence of the stimulus, what degree of control does the
operator have over when he must initiate his response?


Definitions

7  Full operator control - the operator is the sole determiner
   of when the response will be made.
6
5
4  Partial operator control - the response must be made within
   a reasonable time after the stimulus occurs but the operator
   determines when within the interval the response will take place.
3
2
1  No operator control - the operator must respond as soon
   as the stimulus occurs.

Examples (listed from the high end of the scale to the low end)

•  Playing a game of chess by yourself where you play both sides
   and there is no time limit for responding.

•  The traffic light turns red when you are 500 yards from it; you
   have options as to when you will hit the brake.

•  Typical reaction time task. When the light comes on, push this
   button as fast as you can.




*10.  SIMULTANEITY  OF  RESPONSES  (SIMU) 


The responses which the operator makes in producing one output unit
may involve one or more effectors (e.g., hand, foot, arm, voice, etc.).
Depending upon the task, these effectors may or may not be used
simultaneously. For example, both hands (two effectors) are used
simultaneously in playing a piano.

How many effectors are being used simultaneously during the present
task?


zero


two


three


four
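
As an illustration of the count this item asks for, the sketch below finds the largest number of effectors in use at the same moment, given a hypothetical record of when each effector is active. The record format, the function name, and the choice to report a peak of one as zero (to match the answer options as listed, which run zero, two, three, four) are assumptions for the example; the report itself asks the judge to make this count by inspection. The sketch also assumes that the intervals for any single effector do not overlap one another.

```python
from typing import List, Tuple


def simultaneous_effectors(usage: List[Tuple[str, float, float]]) -> int:
    """Peak number of effectors active at once.

    `usage` holds hypothetical (effector, start, end) records with start < end.
    """
    events = []
    for _effector, start, end in usage:
        events.append((start, 1))   # effector becomes active
        events.append((end, -1))    # effector stops being active
    # Tuples sort by time, then by delta, so an end at time t is processed
    # before a start at time t; back-to-back use is not counted as overlap.
    events.sort()
    active = peak = 0
    for _time, delta in events:
        active += delta
        peak = max(peak, active)
    # Assumption: a single effector at a time is scored as no simultaneity.
    return peak if peak >= 2 else 0


piano = [("left hand", 0.0, 10.0), ("right hand", 0.0, 10.0)]
button = [("right hand", 0.0, 1.0)]
print(simultaneous_effectors(piano))   # 2
print(simultaneous_effectors(button))  # 0
```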


*11.  NUMBER  OF  RESPONSES  (NO.  R) 

Earlier we were concerned about the number of elements, i.e., objects
or components, involved in the production of one output unit. Now
we want to consider the number of responses needed to produce one
output unit. There isn't a necessary one-to-one relationship between
objects and responses.

Count  the  number  of  responses  or  steps  involved  in  producing  one 
output  unit  for  the  present  task.  Enter  this  number  on  the  answer  sheet. 




*12.  NUMBER  OF  PROCEDURAL  STEPS

Earlier we were concerned about the number of elements, i.e., objects
or components, involved in the production of one output unit. Now we want
to consider the number of procedural steps (responses) needed to produce
one output unit. There isn't a necessary one-to-one relationship between
objects and responses.

Consider  the  number  of  responses  or  steps  involved  in  producing  one 
output  unit  for  the  present  task.  Rate  this  task  on  the  scale  below. 


Definitions

7  Large number of steps - the procedure consists of a large
   number of constituent steps.
6
5
4  Medium number of steps - the procedure contains a medium
   number of steps relative to other procedures.
3
2
1  Small number of steps - the procedure consists of few steps.
   At a minimum, only one step may be necessary.

Examples (listed from the high end of the scale to the low end)

•  Build a color TV kit following the enclosed instructions.

•  Solve the equation X² - 4X + 4 = 0.

•  Type the following business letter.

•  Open this combination lock (32L-43R-10L).

•  Press the button whenever the light comes on.




13.  FEEDBACK  (FEED) 


For  present  purposes  the  term  FEEDBACK  refers  to  information 
which  an  operator  may  get  about  the  correctness  of  a  response.  In  this 
scale  we  are  interested  in  how  quickly  feedback  occurs  once  the  response 
is  made. 


Definitions

7  Immediate feedback - operator knows whether the response
   was correct as soon as it was completed.
6
5
4  Delayed feedback - operator receives feedback regarding
   his responses after entire task is completed.
3
2
1  No feedback provided - operator never receives feedback.

Examples (listed from the high end of the scale to the low end)

•  Finding the correct switch to turn on a light.

•  Opening a combination lock having five numbers.

•  Student takes a mid-term exam but is not told what grade he got.




14.  DEGREE  OF  MUSCULAR  EFFORT  INVOLVED  (MUSC)

This  dimension  considers  the  amount  of  muscular  effort  required  to 
perform  the  task.  Examine  the  task  and  identify  the  most  physically 
strenuous  part  of  it.  Rate  this  part  on  the  scale  below. 


Definitions

7  High amount of muscular effort - response(s) require a high
   degree of muscular involvement.
6
5
4  Moderate amount of muscular effort required for the response(s).
3
2
1  Low amount of muscular effort required.

Examples (listed from the high end of the scale to the low end)

•  Do 40 push ups.

•  Lift the heaviest weight possible.

•  Tighten nuts on bolts securely with a wrench.

•  Solder two wires together.

•  Add numbers and report the sum aloud.




15.  OPERATOR  CONTROL  OF  THE  STIMULUS  (OCOS) 

What  degree  of  control  does  the  operator  have  over  either  the  occurrence 
or  relevance  of  the  stimulus? 


Definitions

7  Full operator control - the operator is the sole determiner
   of when the stimulus occurs or when it becomes relevant.
6
5
4  Partial operator control - the operator has some control
   over when the stimulus either occurs or becomes relevant.
3
2
1  No operator control - the operator has no control over when
   the stimulus occurs or when it becomes relevant.

Examples (listed from the high end of the scale to the low end)

•  Shooting skeet; shooter determines when "bird" appears.

•  Controlling the speed of your car in approaching a traffic light
   in order to have a green light when you get to the intersection.

•  Waiting for the telephone to ring.




16.  REGULARITY  OF  STIMULUS  OCCURRENCE  (ROSO) 

Consider the critical stimulus or stimulus complex to which the
operator must attend. Does it occur at regular (i.e., equal) intervals
or at irregular intervals? Treat regular intervals and constant
presence of the stimulus as equivalent conditions.


Rate the present task on this dimension.


Definitions

7  High regularity - stimulus occurs at regular intervals or
   is constantly present.
6
5
4  Medium regularity - stimulus occurs at irregular (unequal)
   intervals but there is a pattern of occurrence.
3
2
1  Low regularity - stimulus occurs at very irregular (almost
   random) intervals.

Examples (listed from the high end of the scale to the low end)

•  Cars coming along an assembly line.

•  Looking at a photograph of an object.

•  Receiving Morse code.

•  Detecting random signals on a CRT display.
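
The three anchors above separate equal intervals, unequal intervals that follow a pattern, and near-random intervals. The sketch below shows one way such a distinction could be made from a list of stimulus onset times. It is an assumption-laden illustration, not the procedure used in this report, where judges rate regularity by inspection. The function name, the 5% tolerance, and the use of a repeating interval cycle as a stand-in for "a pattern of occurrence" are all hypothetical choices.

```python
from math import isclose
from typing import List


def classify_regularity(onsets: List[float], rel_tol: float = 0.05) -> str:
    """Return "high", "medium", or "low" for a sorted list of onset times."""
    if len(onsets) < 3:
        raise ValueError("need at least three onsets to compare intervals")
    intervals = [later - earlier for earlier, later in zip(onsets, onsets[1:])]

    def same(x: float, y: float) -> bool:
        return isclose(x, y, rel_tol=rel_tol)

    # Equal intervals throughout: the high-regularity anchor.
    if all(same(interval, intervals[0]) for interval in intervals):
        return "high"

    # Unequal intervals that repeat with some cycle length: treated here
    # as a crude proxy for "a pattern of occurrence".
    for period in range(2, len(intervals) // 2 + 1):
        if all(same(intervals[k], intervals[k - period])
               for k in range(period, len(intervals))):
            return "medium"

    return "low"


print(classify_regularity([0, 1, 2, 3, 4]))            # high
print(classify_regularity([0, 1, 4, 5, 8, 9, 12]))     # medium
print(classify_regularity([0, 1, 5, 7, 14, 17, 22]))   # low
```

A constantly present stimulus has no onsets to compare; under the instruction above it would simply be rated at the high end of the scale without any computation.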




17.  STIMULUS  OR  STIMULUS  COMPLEX  DURATION  (SDUR) 


Consider the critical stimulus or stimulus-complex to which the
operator must attend in performing the task. Relative to the total task
time, for how long a duration is the stimulus or stimulus-complex present
during the task?


Definitions

7  Long duration - stimulus would remain indefinitely.
6
5
4  Medium duration - stimulus remains present until changed
   (spatially, temporally, etc.) by the response made to it.
3
2
1  Short duration - stimulus ceases prior to response being
   made to it.

Examples (listed from the high end of the scale to the low end)

•  Drawing a picture by observing a model of the object being drawn.

•  Red light goes out when operator pushes a button.

•  Operator must identify words or targets presented
   tachistoscopically.


18.  VARIABILITY  OF  STIMULUS  LOCATION  (VARS) 

Judge  the  degree  to  which  the  physical  location  of  the  stimulus  or 
stimulus  complex  is  predictable  over  task  time. 


Definitions

7  High predictability - stimulus location remains basically
   unchanged.
6
5
4  Medium predictability - location changes but in a
   known manner or pattern.
3
2
1  Low predictability - location changes in an almost random
   fashion.

Examples (listed from the high end of the scale to the low end)

•  Stimulus is a red light located on a display panel.

•  Visually following an arrow in flight toward a target.

•  Predicting which leaf will fall from a tree next.




APPENDIX  6 

TASKS  USED  IN  2-JUDGE  STUDY 

The  judges  in  this  study  rated  tasks  appearing  in  a  number  of 
published  articles.  In  each  case,  their  attention  was  directed 
toward the method section, focusing on the apparatus and
instructions.

A  list  of  the  references  so  viewed  is  provided  in  lieu  of 
descriptions  of  the  tasks  themselves. 




REFERENCES  USED  IN  SECOND  POST-DICTION  STUDY


Adams, J. A. Psychomotor performance as a function of intertrial
rest interval. Journal of Experimental Psychology, 1954, 48,
131-135.

Archer, E. J., Kent, G. W., & Mote, F. A. Effect of long-term
practice and time-on-target information feedback on a complex
tracking task. Journal of Experimental Psychology, 1956, 51,
103-112.

Bilodeau, E. A. Some effects of various degrees of supplemental
information given at two levels of practice upon the acquisition
of a complex motor skill. Research Bulletin 52-15, April
1952, Human Resources Research Center, Lackland Air Force
Base, San Antonio, Texas.

Birren, J. E., & Fisher, M. B. Standardization of two tests of
hand-eye coordination: A two-hand complex tapping test and
a rotary pursuit test. Research Project X-293, Report No. 6,
1945, NMRI, Bethesda, Maryland.

Briggs, G. E., Fitts, P. M., & Bahrick, H. P. Learning and
performance in a complex tracking task as a function of visual
noise. Research Report AFPTRC-TN-56-67, June 1956, Air
Force Personnel and Training Research Center, Lackland
Air Force Base, Texas.

Brown, C. W., Ghiselli, E. E., Jarrett, R. F., Minium, E. W., &
U'Ren, R. M. Comparison of aircraft controls for prone and
seated position in three-dimensional pursuit task. AF Technical
Report No. 5956, October 1949, U. S. Air Force Air
Materiel Command, Wright-Patterson Air Force Base, Dayton,
Ohio.

Cook, B. S., & Hilgard, E. R. Distributed practice in motor
learning: Progressively increasing and decreasing rests.
Journal of Experimental Psychology, 1949, 39, 169-172.

Dore, L. R., & Hilgard, E. R. Spaced practice and maturation
hypothesis. Journal of Psychology, 1937, 4, 245-259.

Fleishman, E. A. Unpublished data on two-hand coordinator.

Fleishman, E. A., & Rich, S. Role of kinesthetic and spatial-visual
abilities in perceptual-motor learning. Journal of Experimental
Psychology, 1963, 66, 6-11.




Gagne, R. M., & Bilodeau, E. A. The effects of target size variation
on skill acquisition. Research Bulletin AFPTRC-TR-54-5,
April 1954, Air Force Personnel and Training Research Center,
Lackland Air Force Base, San Antonio, Texas.

Goldstein, M., & Rittenhouse, C. H. The effects of practice with
triggering omitted on performance of the total pedestal sight
gunnery task. Technical Report 53-9, May 1953, Human
Resources Research Center, Lackland Air Force Base, San
Antonio, Texas.

Howland, D., & Merrill, E. N. The effect of physical constants of
a control on tracking performance. Journal of Experimental
Psychology, 1953, 46, 353-360.

Lewis, D., & Shephard, A. H. Devices for studying associative
interference in psychomotor performance. IV. The turret
pursuit apparatus. Journal of Psychology, 1950, 29, 173-182.

Lincoln, R. S. Learning and retaining a rate of movement with the
aid of kinesthetic and verbal cues. Journal of Experimental
Psychology, 1956, 51, 199-204.

Noble, C. E. An attempt to manipulate incentive motivation in a
continuous tracking task. Research Bulletin AFPTRC-TR-54-43,
October 1954, Air Force Personnel and Training Research
Center, Lackland Air Force Base, San Antonio, Texas.

Reynolds, B., & Adams, J. A. Effect of distribution and shift in
distribution of practice within a single training session.
Journal of Experimental Psychology, 1953, 46, 137-145.

Reynolds, B., & Bilodeau, I. M. Acquisition and retention of three
psychomotor tests as a function of distribution of practice
during acquisition. USAF Human Resources Research and
Development, Lackland Air Force Base, San Antonio, Texas.

Spieth, W. An investigation of individual susceptibility to
interference in the performance of three psychomotor tasks.
Research Bulletin 53-8, April 1953, Human Resources Research
Center, Lackland Air Force Base, San Antonio, Texas.*


*This study yielded two groups and, hence, two sets of learning data
for the post-diction study.




